GiuseppeChillemi
@GiuseppeChillemi
@greggirwin You are right, but if those features have only marginal use cases, it is better not to add a difference and to keep the behavior the same. For example, a path on a non-existent key in objects leads to an error, while on other data it leads to none. It's OK for me to keep this, as I see it could be important. But using a path to put a value in a map seems not so useful.
Gregg Irwin
@greggirwin
One man's marginal use case is another's necessary feature. :^)
GiuseppeChillemi
@GiuseppeChillemi
@greggirwin We are talking about 1M rows.
@greggirwin :) That's the relativeness of things, the R in Rebol.
Boleslav Březovský
@rebolek
Wow, how much memory does it take?
With so many rows I wouldn't want to waste space with unnecessary indexes :-)
Gregg Irwin
@greggirwin
Some of this is, as we admit, implementation details leaking out a bit, which we have to weigh against justifying how you think about things. What helps you to understand versus things you can safely ignore.
GiuseppeChillemi
@GiuseppeChillemi
@rebolek Actually, I process it on SQL Server, but I would like to mirror the data set inside Red to process it faster when that becomes possible.
Gregg Irwin
@greggirwin
~1M is a great size for this discussion. Beyond what I feel I could comfortably brute force, but small enough that I wouldn't want to build a complex infrastructure.
The real catch, though, is how you want and need to query it. This is where there's a huge gap between SQL engines and simpler ISAM models. I want to build the latter into Red, which is the foundation for higher level models.
GiuseppeChillemi
@GiuseppeChillemi
It is 15 years of the company's data: 10 rows per document, 6000 documents each year. And you have to backlink them to the Customer and Articles tables.
@greggirwin I would like to experiment. Actually, @henrikmk's List-View component in the Vid-Extension-Kit is able to handle such data using a block-of-objects structure, though I have only tried it with up to 200K entries.
Boleslav Březovský
@rebolek
I have a query dialect for a block of maps, but it can easily be adapted for a block of blocks too.
GiuseppeChillemi
@GiuseppeChillemi
@rebolek You are doing great things!
I am trying to implement a datatype system in Rebol and will port it to Red as soon as the Datagrid component is ready. I was working on a make-row function and started testing the differences between Rebol and Red, hence this topic.
Boleslav Březovský
@rebolek
@GiuseppeChillemi thanks :)
Gregg Irwin
@greggirwin
When I did an image management system I did a few things. 1) use the file system for the images themselves, 2) build indexes for known query needs, 3) break things down by time. That may not work for you, but by splitting up the data, it became more manageable, and query results could then be combined if needed. Often (again in my case) people only cared about the most recent, and "paging back in time" worked well.
There are always tradeoffs, and no external DB needs in that case, but it scaled to 100M+ images and ~50TB of data in pure R2.
Boleslav Březovský
@rebolek
Nice!
GiuseppeChillemi
@GiuseppeChillemi
@greggirwin Imagine loading in memory those 50TB of data :-)
Oldes Huhuman
@Oldes
Having 1M rows where each row is a map is a terrible waste of memory, and of CPU as well. Even having each row as a small block is a waste, as a block with 3 values is actually a block for 8 values, etc.
GiuseppeChillemi
@GiuseppeChillemi
I see only a hash and a separate headings block as a solution. Do you have any other?
Gregg Irwin
@greggirwin
It's all tradeoffs. With sub-structures you can change them independently, they can vary in structure, etc. With fixed offset values, and no markers at all, changes and direct data viewing become much more difficult.
Gregg Irwin
@greggirwin

If I have 1M records, each with a couple numbers and a few strings, the size in memory, as sub-blocks, may be ~10x the raw data size. A linear search on a value will take ~1s. So it's well within brute force territory for an in-house system. Add a few more tables though, and you'll hit 32-bit Red limits. Then it's weighing programmer time vs machine time vs future needs.

I never want to be needlessly wasteful, but efficiency often comes with associated costs.
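
The estimates above can be checked with a rough measurement. This is a minimal sketch, not a benchmark: the record layout (two numbers and two strings) and the scaled-down row count are assumptions for illustration, and it assumes a Red build that provides `stats` and `dt`. The reported figures depend on the build and on garbage collector activity.

```red
Red []

n: 100000                       ; scaled down from the 1M rows discussed
before: stats                   ; memory in use before building the table

rows: make block! n
repeat i n [
    ; hypothetical record: two numbers and two strings, one sub-block per row
    append/only rows reduce [i  i * 2  form i  copy "some text"]
]
print ["approx. bytes used:" stats - before]

; time a worst-case linear scan (the match is the last record)
print ["scan time:" dt [
    foreach row rows [if row/1 = n [break]]
]]
```

Dividing the reported bytes by the raw size of the fields gives the overhead factor for sub-blocks on a given build.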

Gregg Irwin
@greggirwin

I remember in my VB days how battles raged over the use of the Variant datatype and its comparative inefficiency to base types (and other issues). When I transitioned away from VB I was still in the "Variants Bad" camp. That's when I found Rebol, and saw that it could be worth it. The difference, I think, is that variants didn't add any new expressive power. They solved some problems, but caused others.

Soon enough, in the grand scheme of things, we'll have 64-bit value slots. As with Unicode, or Mac fat binaries, it may look like things will explode in size. What we have to ask is if the ROI on that is worth it. Red is a high level language, but we still have Red/System and types like vector! that can be leveraged where needed.

Memory is ~$5/GB. Are the limits of Moore's Law, and the world at large, finally going to force us to think differently? Maybe not. Brute force just takes us farther, always just enough for many things, so we don't have to. But what if we choose to? What if all your data didn't live in one process, but was split among communicating processes, even locally? Is this "microservices"? Is it better to have one DB daemon that does it all in a 64GB in memory DB or 16x3GB limited processes? 3GB is still a lot of data, just not by today's (wasteful?) standards where data explosion is the norm. I'm all for "we might need it someday" but even I have my limits.

Boleslav Březovský
@rebolek
@Oldes that’s true. In such a case, a flat block with /skip is the best alternative.
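
A minimal sketch of the flat-block layout, assuming a hypothetical three-field record (id, name, qty); the field names and sample values are illustrative only.

```red
Red []

; Hypothetical table: 3 fields per record (id, name, qty), stored flat.
rec-size: 3
data: [
    1 "apple" 10
    2 "pear"  20
    3 "plum"  30
]

; Walk the records without creating any sub-blocks.
foreach [id name qty] data [print [id name qty]]

; /skip makes FIND treat the block as fixed-size records,
; so it looks for the id at record boundaries only.
rec: find/skip data 2 rec-size
if rec [print rec/2]            ; second field of the matching record

; Pull a single column out of the flat block.
names: extract/index data rec-size 2
probe names
```

The tradeoff is the one noted earlier in the discussion: no per-row block overhead, but every access depends on the fixed record size, so structural changes touch all code that knows the layout.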
GiuseppeChillemi
@GiuseppeChillemi
I will try to profile the differences
GiuseppeChillemi
@GiuseppeChillemi

Well, it seems Red acts better than Rebol here at honoring get value notation.

Rebol:

>> fc: reduce [func [a][]]
== [func [a][]]
>> :fc/1
** Script Error: 1 is missing its a argument
** Where: halt-view
** Near: :fc/1

Red:

>> fc: reduce [func [a][]]
== [func [a][]]
>> :fc/1
== func [a][]
>>
hiiamboris
@hiiamboris
keep in mind get 'fc/1 is broken though
GiuseppeChillemi
@GiuseppeChillemi
On Red or Rebol?
hiiamboris
@hiiamboris
nobody breaks Rebol anymore ;)
GiuseppeChillemi
@GiuseppeChillemi
I break it!
hiiamboris
@hiiamboris
haha okay do that ;)
no, it was broken in Red
GiuseppeChillemi
@GiuseppeChillemi
>> get 'fc/1
*** Script Error: fc/1 is missing its a argument
*** Where: get
*** Stack:
It should evaluate to get fc/1.
That should not execute the function, should it?
hiiamboris
@hiiamboris
nope
GiuseppeChillemi
@GiuseppeChillemi
Nope = was I right or wrong?
hiiamboris
@hiiamboris
it should not
that's what I was talking about
Oldes Huhuman
@Oldes
@GiuseppeChillemi in Rebol3 I have:
>> fc: reduce [func [a][]]
== [make function! [[a][]]]

>> type? :fc/1
== function!

>> type? get 'fc/1
== function!
Toomas Vooglaid
@toomasv
In Red path notation is "active", evaluating its target. To access passively use other accessors:
>> fc: reduce [func [a][]]
== [func [a][]]
>> type? first fc
== function!
>> type? pick fc 1
== function!
Greg T
@gltewalt
Yep, path! is an active type
GiuseppeChillemi
@GiuseppeChillemi
A path should not be active if you precede it with '
Petr Krenzelok
@pekr
Spam?
Gregg Irwin
@greggirwin
Deleted.
Boleslav Březovský
@rebolek

@greggirwin I was looking at the Red/code repo, where to put XML functions. See the description from https://github.com/red/code/tree/master/Library :

This is a collection of useful Library functions and modules that can be included in Red programs. Its two sub-sections are Red and Red/System. All library functions and modules have API documentation.

There are actually no subsections. Either the description should be changed or the subsections should be created.

I’ll leave XML-tools in the markup repo for now until we decide how to organize the stuff in code.
Gregg Irwin
@greggirwin
Both red/code and red/community probably need a maintainer to see if the organization makes sense. Subsections make sense to me.