Skip to content

Conversation

anka-213
Copy link
Member

@anka-213 anka-213 commented Nov 29, 2020

This prevents HUGE space leak and makes compiling a PGF a LOT faster

For example, an application grammar moved from taking over 50GB
of ram and taking 5 minutes (most of which is spent on garbage colelction)
to taking 1.2 seconds and using 42mb of memory

The price we pay is that the "variable #n is out of scope" error is now
lazy and will happen when we try to evaluate the term instead of
happening when the function returns and allowing the caller to chose how
to handle the error.
I don't think this should matter in practice, since it's very rare;
at least Inari has never encountered it.

@anka-213 anka-213 force-pushed the make-it-fast branch 2 times, most recently from a17f107 to 2d580d2 Compare November 29, 2020 14:08
@johnjcamilleri
Copy link
Member

This sounds very exciting!

I'm not too sure which error you are referring to, could you provide an example which triggers it? I'm curious to compare the behaviours, even if this case is rare or non-existent in the wild.

Does this have any effect on PGF size? I presume the resulting PGF produced with this modification is different, in the sense that it will have a different md5sum. Has any comparison of old/new PGFs been done with say gftest?

@inariksit
Copy link
Member

@johnjcamilleri In the old version, where value2term returned an Either Int Term, the potential error ("variable #n is out of scope") was caught in 5 places with almost identical behaviour: 1, 2, 3, 4, 5. I've never seen that error in the wild, but maybe @aarneranta or @krangelov can explain when it's supposed to be triggered, so we can test.

It looks to me that the PGFs are identical. I ran gftest on an application grammar as a part of my normal development process (store old PGF compiled with old GF -> implement new feature -> compile with new GF -> gftest the two versions) and got the expected results, that is, only intended changes. I also linearised manually some ad hoc test sentences on the PGF compiled with the new GF, and was happy with them.
Here's a more rigorous test with FoodsEng:

➜ gf -make FoodsEng.gf
Writing Foods.pgf...
➜ mv Foods.pgf /tmp   

➜ gf-old-slow -make FoodsEng.gf
Writing Foods.pgf...

➜ gftest -g Foods.pgf -o /tmp/Foods.pgf 
<TL;DR: identical result>

➜ md5sum Foods.pgf
0e5e30e31e0976cae23884fad2c3b784  Foods.pgf
➜ md5sum /tmp/Foods.pgf
0e5e30e31e0976cae23884fad2c3b784  /tmp/Foods.pgf

@johnjcamilleri
Copy link
Member

Ah, so this is a GF compilation error we're talking about. If the PGFs have the same md5 they are surely identical, no need to use gftest for that 🙃

@anka-213
Copy link
Member Author

anka-213 commented Dec 2, 2020

This change should not change the behaviour in any way other than with respect to performance (and possibly error messages), since all I've done is remove the Either Int wrapping from the function value2term, replaced the Left case with a call to error and replaced all the monadic functions in it with their non-monadic counterparts.

The only ways it could change anything is if some code relies on the error of the Left case being thrown swiftly instead of eventually (or possibly never, if the Term is not fully evaluated). My guess is that the Left case would correspond to a compiler bug and not a user error, but I may be mistaken. If it was user-facing, the UX of "variable #637 is out of scope" would be pretty bad.

@johnjcamilleri
Copy link
Member

Sounds reasonable, Andreas. I would make the same assumption about it indicating a compiler bug.

Since the performance gains are so huge, we should really push this through for inclusion in the next release. But before merging, it would be nice to get the blessing of someone who knows the internals well. Looking at the git history, seems it was @krangelov who introduced the Either Int to value2term here.

@johnjcamilleri
Copy link
Member

I finally got around to testing the version of GF, however I'm disappointed to say that I haven't noticed any improvement at all 😞
I tried two things:

  1. building the RGL to GFOs
  2. compiling a large application grammar (to PGF) which normally takes 90s and 5.56 GB

In both cases I saw identical time and space usage with the old and new GF versions. I'm pretty sure I was not using the same GF executable for both, as I changed the version number in gf.cabal for this version and confirmed with gf --version before each test.

So, any idea why in certain cases this gives no improvement at all? Or maybe I'm doing something wrong?
Can someone else run a test and produce a noticeable improvement?

@inariksit
Copy link
Member

@johnjcamilleri Building the RGL to GFOs uses --no-pmcfg flag, so that's already fast. The changes are in effect when compiling to PGF.

The only grammar I've seen with such a dramatic effect is the one I mentioned in the email to you and Aarne on 29th November at 16:23 (DG address). That particular application grammar only compiles to 300 or so concrete categories, so it's very mysterious how come it even took that long before the fix. I have just synced it to DG's repo, so you can easily find the grammar there.

As for other grammars, I see mostly differences in memory usage. For example, compiling the French RG into a PGF takes still over 2 minutes, but it never uses more than 700MB memory on my computer. Contrast with the old version, where I start compiling and it's almost immediately at 3GB. I'll try some others to see if I find a clear test that is still shorter than 2 minutes. :-P

@inariksit
Copy link
Member

Results for other grammars

Here's a result for the Somali resource grammar:

gf-old-slow -make -v=0 --force-recomp LangSom.gf  72.69s user 1.41s system 99% cpu 1:14.48 total (~1.3GB)
gf          -make -v=0 --force-recomp LangSom.gf  67.21s user 0.83s system 99% cpu 1:08.39 total (~400MB)

Not a huge difference in time, I agree. I had my activity monitor open, and gf-old-slow used 1.3GB memory, gf used around 400 MB.

For Basque RG, the numbers are practically the same, 1.41 GB old vs. 1.37 GB new (tested just once), and both around 24 seconds.

Zulu RG has another pattern, where both versions take around 2GB memory for the bulk of the compilation, and a peak at 6-7 GB, but the new version is quite significantly faster:

gf-old-slow -make -v=0 --force-recomp LangZul.gf  161.49s user 44.77s system 81% cpu 4:13.15 total
gf          -make -v=0 --force-recomp LangZul.gf  156.44s user 34.64s system 87% cpu 3:38.75 total

Wild speculation about the nature of the bug and fix

So if the grammar is actually big, the fix isn't doing miracles. (For example, I still don't manage to produce a PGF of Romanian.) But sometimes GF is unreasonably slow for smaller grammars, in a way that genuinely looks like a bug. Looking at graphs, what seems to improve was the time spent on garbage collection. Here's the before, on that anomalous application grammar:

before

And here's after:
after

So this fix seems to work for an anomalous case. While not as dramatic as we hoped for, it's still an improvement---maybe more accurately described as a bugfix? It's funny that the first grammar on which we tried the fix produced a result of 50GB to 42MB. That certainly coloured our expectations :-P

I've been using this new GF for 11 days now, and so far haven't noticed that anything is going worse than it used to. Of course, it would be nice to have guarantees that it didn't cause any new anomalies, like some Foods grammar suddenly taking 10 minutes.

Copy link
Contributor

@mengwong mengwong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

case x of y -> return y
seems like a rather wordy way of saying return x
:)

anka-213 added 3 commits July 12, 2021 15:50
This prevents HUGE space leak and makes compiling a PGF a LOT faster

For example, an application grammar moved from taking over 50GB
of ram and taking 5 minutes (most of which is spent on garbage colelction)
to taking 1.2 seconds and using 42mb of memory

The price we pay is that the "variable #n is out of scope" error is now
lazy and will happen when we try to evaluate the term instead of
happening when the function returns and allowing the caller to chose how
to handle the error.
I don't think this should matter in practice, since it's very rare;
at least Inari has never encountered it.
@inariksit inariksit merged commit 667bfd3 into GrammaticalFramework:master Jul 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants