@jainankitk
Contributor

@jainankitk jainankitk commented Jul 22, 2024

Description

Delegate matches in the PointRangeQuery weight to the relate method.

Issue

Resolves #13598

Contributor

@iverase iverase left a comment

I would expect this change to result in a slowdown for this type of query.

You are proposing to replace the current implementation with a slower one that computes the relationship between a range and the query. Note that this method is the hottest in the execution path.

Have you done any performance benchmark of the change?

@harshavamsi

> I would expect this change to result in a slowdown for this type of query.
>
> You are proposing to replace the current implementation with a slower one that computes the relationship between a range and the query. Note that this method is the hottest in the execution path.
>
> Have you done any performance benchmark of the change?

@iverase I'm struggling to see how this could be worse. It looks like we are simply delegating from matches to relate to simplify the logic; relate should return the cell attribute after comparing the values quickly. What am I missing?

@iverase
Contributor

iverase commented Jul 24, 2024

If we look at the current implementations of matches and relate, both iterate over the dimensions and both check whether the dimension is disjoint from the query range. If it is, they bail out. The difference arises when the dimension is not disjoint: the relate method then executes another check to compute whether the dimension crosses or is fully inside the range:

    crosses |=
        comparator.compare(minPackedValue, offset, lowerPoint, offset) < 0
            || comparator.compare(maxPackedValue, offset, upperPoint, offset) > 0;

This is the extra check your change adds, which IMHO is not a good idea.
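
To make the cost difference concrete, here is a minimal, self-contained sketch of the two code paths discussed above. It is not the actual PointRangeQuery source: the class name RangeCheckSketch, the fixed two-dimensional one-byte-per-dimension range, the matchesViaRelate helper, and the comparison counter are all assumptions made for illustration. The sketch only mirrors the structure described in this thread (a per-dimension disjoint check in both methods, plus the extra crosses check in relate).

```java
import java.util.Arrays;

// Illustrative sketch only; names and the fixed range are hypothetical,
// and this is not the Lucene implementation.
public class RangeCheckSketch {
  enum Relation { CELL_OUTSIDE_QUERY, CELL_CROSSES_QUERY, CELL_INSIDE_QUERY }

  static final int NUM_DIMS = 2;
  static final int BYTES_PER_DIM = 1;
  static final byte[] LOWER = {10, 10}; // inclusive lower bound per dimension
  static final byte[] UPPER = {20, 20}; // inclusive upper bound per dimension

  // Counts byte comparisons so the two paths can be compared.
  static int comparisons = 0;

  // Unsigned comparison of one dimension of a and b, starting at offset.
  static int compare(byte[] a, byte[] b, int offset) {
    comparisons++;
    return Arrays.compareUnsigned(
        a, offset, offset + BYTES_PER_DIM, b, offset, offset + BYTES_PER_DIM);
  }

  // Direct point check: at most two comparisons per dimension.
  static boolean matches(byte[] point) {
    for (int dim = 0, offset = 0; dim < NUM_DIMS; dim++, offset += BYTES_PER_DIM) {
      if (compare(point, LOWER, offset) < 0 || compare(point, UPPER, offset) > 0) {
        return false;
      }
    }
    return true;
  }

  // Cell check: the disjoint test, then the extra crosses test.
  static Relation relate(byte[] min, byte[] max) {
    boolean crosses = false;
    for (int dim = 0, offset = 0; dim < NUM_DIMS; dim++, offset += BYTES_PER_DIM) {
      if (compare(min, UPPER, offset) > 0 || compare(max, LOWER, offset) < 0) {
        return Relation.CELL_OUTSIDE_QUERY;
      }
      crosses |= compare(min, LOWER, offset) < 0 || compare(max, UPPER, offset) > 0;
    }
    return crosses ? Relation.CELL_CROSSES_QUERY : Relation.CELL_INSIDE_QUERY;
  }

  // Hypothetical delegation: treat the point as a degenerate cell.
  static boolean matchesViaRelate(byte[] point) {
    return relate(point, point) == Relation.CELL_INSIDE_QUERY;
  }

  public static void main(String[] args) {
    byte[] inside = {15, 12};

    comparisons = 0;
    boolean direct = matches(inside);
    System.out.println("matches:          " + direct + ", comparisons=" + comparisons);

    comparisons = 0;
    boolean delegated = matchesViaRelate(inside);
    System.out.println("matchesViaRelate: " + delegated + ", comparisons=" + comparisons);
  }
}
```

For a point inside the range, both paths return the same answer, but in this sketch the delegated path performs twice as many comparisons per dimension, because the crosses check can never be true for a non-disjoint degenerate cell yet is still evaluated. Whether that difference is measurable in practice is what the benchmark is meant to show.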

@jainankitk
Contributor Author

jainankitk commented Jul 25, 2024

@iverase - Adding the benchmark results below:

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                 LowSloppyPhrase       20.37      (6.5%)       19.87      (5.4%)   -2.5% ( -13% -   10%) 0.195                                                              
                      AndHighMed      145.52      (4.9%)      142.07      (4.7%)   -2.4% ( -11% -    7%) 0.121                                                              
               HighTermTitleSort       61.32      (5.9%)       59.98      (7.5%)   -2.2% ( -14% -   11%) 0.305                                                              
                HighSloppyPhrase       17.92      (5.1%)       17.53      (5.2%)   -2.2% ( -11% -    8%) 0.185                                                              
            HighTermTitleBDVSort       20.04      (3.7%)       19.69      (4.4%)   -1.7% (  -9% -    6%) 0.180                                                              
                      AndHighLow      898.04      (4.6%)      882.94      (3.4%)   -1.7% (  -9% -    6%) 0.189                                                              
         AndHighMedDayTaxoFacets       35.31      (5.6%)       34.80      (5.1%)   -1.4% ( -11% -    9%) 0.398                                                              
             LowIntervalsOrdered       11.22      (3.6%)       11.06      (3.8%)   -1.4% (  -8% -    6%) 0.223                                                              
                       OrHighMed      170.26      (3.1%)      168.45      (4.0%)   -1.1% (  -7% -    6%) 0.349                                                              
                        Wildcard       39.53      (4.9%)       39.18      (5.2%)   -0.9% ( -10% -    9%) 0.576                                                              
                 MedSloppyPhrase        9.47      (4.1%)        9.39      (6.0%)   -0.9% ( -10% -    9%) 0.596                                                              
                       OrHighLow      362.38      (3.2%)      359.34      (3.0%)   -0.8% (  -6% -    5%) 0.393                                                              
                     AndHighHigh       79.94      (6.3%)       79.33      (6.4%)   -0.8% ( -12% -   12%) 0.707                                                              
            HighIntervalsOrdered       25.47     (13.4%)       25.28     (10.8%)   -0.7% ( -21% -   27%) 0.846                                                              
            MedTermDayTaxoFacets        5.40      (5.0%)        5.36      (7.1%)   -0.7% ( -12% -   11%) 0.703                                                              
       BrowseDayOfYearTaxoFacets        2.41      (7.9%)        2.39      (8.1%)   -0.7% ( -15% -   16%) 0.790                                                              
           HighTermDayOfYearSort      326.80      (5.0%)      324.76      (4.8%)   -0.6% (  -9% -    9%) 0.688                                                              
                   OrHighNotHigh      190.76      (5.9%)      189.84      (6.5%)   -0.5% ( -12% -   12%) 0.806                                                              
                       MedPhrase       57.90      (8.3%)       57.75      (5.7%)   -0.3% ( -13% -   14%) 0.904                                                              
                      HighPhrase       22.11      (7.2%)       22.06      (7.5%)   -0.2% ( -13% -   15%) 0.922                                                              
     BrowseRandomLabelTaxoFacets        1.95      (3.6%)        1.94      (4.8%)   -0.2% (  -8% -    8%) 0.891                                                              
                      OrHighHigh       50.72     (10.3%)       50.66     (10.0%)   -0.1% ( -18% -   22%) 0.969                                                              
                     LowSpanNear       23.16      (6.6%)       23.14      (3.4%)   -0.1% (  -9% -   10%) 0.958                                                              
        AndHighHighDayTaxoFacets        5.18      (3.0%)        5.17      (4.6%)   -0.1% (  -7% -    7%) 0.954                                                              
            BrowseDateSSDVFacets        0.57      (7.6%)        0.57     (11.0%)   -0.0% ( -17% -   20%) 0.993                                                              
                         LowTerm      555.76      (2.7%)      556.05      (4.5%)    0.1% (  -6% -    7%) 0.964                                                              
                    OrNotHighMed      207.86      (4.7%)      207.97      (6.3%)    0.1% ( -10% -   11%) 0.976                                                              
                    OrHighNotLow      253.44      (5.9%)      253.66      (7.2%)    0.1% ( -12% -   14%) 0.967                                                              
       BrowseDayOfYearSSDVFacets        3.05      (9.4%)        3.06      (4.6%)    0.1% ( -12% -   15%) 0.968                                                              
           BrowseMonthTaxoFacets        2.52     (10.6%)        2.52     (12.0%)    0.1% ( -20% -   25%) 0.979                                                              
            BrowseDateTaxoFacets        2.33      (7.0%)        2.33      (8.0%)    0.4% ( -13% -   16%) 0.865                                                              
             MedIntervalsOrdered       11.10      (5.2%)       11.16      (6.4%)    0.5% ( -10% -   12%) 0.769                                                              
                     MedSpanNear       16.02      (5.7%)       16.11      (4.5%)    0.6% (  -9% -   11%) 0.721                                                              
          OrHighMedDayTaxoFacets        2.41      (3.5%)        2.42      (9.5%)    0.6% ( -11% -   14%) 0.785                                                              
                          IntNRQ       49.96      (5.7%)       50.29      (5.8%)    0.7% ( -10% -   12%) 0.719                                                              
                         Respell       31.71      (7.9%)       32.01      (7.1%)    0.9% ( -13% -   17%) 0.690                                                              
                       LowPhrase       71.67      (5.2%)       72.41      (4.0%)    1.0% (  -7% -   10%) 0.480                                                              
           BrowseMonthSSDVFacets        3.15      (6.9%)        3.18      (8.3%)    1.1% ( -13% -   17%) 0.663                                                              
                    OrNotHighLow      574.19      (3.1%)      580.93      (3.5%)    1.2% (  -5% -    8%) 0.263                                                              
                        HighTerm      264.24      (5.1%)      267.41      (7.0%)    1.2% ( -10% -   14%) 0.536                                                              
                    HighSpanNear        2.09      (3.2%)        2.11      (3.6%)    1.3% (  -5% -    8%) 0.211                                                              
                    OrHighNotMed      209.51      (6.3%)      212.50      (7.6%)    1.4% ( -11% -   16%) 0.516                                                              
                        PKLookup      125.82      (9.8%)      127.63     (10.0%)    1.4% ( -16% -   23%) 0.647
                   OrNotHighHigh      216.12      (5.4%)      219.47      (5.7%)    1.5% (  -9% -   13%) 0.379
     BrowseRandomLabelSSDVFacets        2.10      (5.8%)        2.13      (9.0%)    1.6% ( -12% -   17%) 0.505                                                                               
               HighTermMonthSort     1076.48      (5.3%)     1094.22      (4.0%)    1.6% (  -7% -   11%) 0.270                                                                               
                          Fuzzy1       42.89      (6.7%)       43.71      (6.4%)    1.9% ( -10% -   16%) 0.358                                                                               
                         Prefix3     1053.01      (6.3%)     1074.45      (6.6%)    2.0% ( -10% -   15%) 0.320                                                                               
                          Fuzzy2       18.52      (5.2%)       18.91      (3.8%)    2.1% (  -6% -   11%) 0.147                                                                               
                         MedTerm      386.62      (4.5%)      398.80      (6.1%)    3.1% (  -7% -   14%) 0.063                                                                                                    
                      TermDTSort       93.06      (6.7%)       96.18      (7.5%)    3.3% ( -10% -   18%) 0.137
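
As a reading aid for the table above, the following sketch recomputes the Pct diff column from the two QPS columns, assuming that column is the relative change of the candidate's mean QPS against the baseline. The class and method names are hypothetical; the input numbers are copied from the IntNRQ and LowSloppyPhrase rows.

```java
// Illustrative sketch; assumes Pct diff = 100 * (candidate - baseline) / baseline.
public class PctDiffSketch {
  // Relative change of the candidate QPS against the baseline, in percent.
  static double pctDiff(double baselineQps, double candidateQps) {
    return 100.0 * (candidateQps - baselineQps) / baselineQps;
  }

  public static void main(String[] args) {
    // QPS values taken from the benchmark table above.
    System.out.printf("IntNRQ:          %.1f%%%n", pctDiff(49.96, 50.29));
    System.out.printf("LowSloppyPhrase: %.1f%%%n", pctDiff(20.37, 19.87));
  }
}
```

Under that assumption, both rows reproduce the table's rounded values, and each change is small relative to the reported standard deviations.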

@jainankitk
Contributor Author

Also, adding the CPU and memory profiles below:

BASELINE:

PROFILE SUMMARY from 1922337 events (total: 1M)                                                                                                                                                                   
  tests.profile.mode=cpu                                                                                                                                                                                          
  tests.profile.count=30                                                                                                                                                                                          
  tests.profile.stacksize=1                                                                                                                                                                                       
  tests.profile.linenumbers=false                                                                                                                                                                                 
PERCENT       CPU SAMPLES   STACK                   
10.59%        203591        org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegmentNHLD()                                                                                                                           
6.29%         120862        org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()                                                                                                                                             
4.20%         80750         jdk.internal.foreign.MemorySessionImpl#checkValidStateRaw()                                                                                                                                                     
3.93%         75500         org.apache.lucene.util.packed.DirectReader$DirectPackedReader20#get()                                                                                                                                           
3.35%         64461         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$25#advance()                                                                                                                                        
3.20%         61497         org.apache.lucene.util.packed.DirectReader$DirectPackedReader12#get()                                                                                                                                           
3.03%         58227         org.apache.lucene.codecs.lucene99.Lucene99PostingsReader$EverythingEnum#nextPosition()                                                                                                                          
2.85%         54804         org.apache.lucene.util.packed.DirectMonotonicReader#get()                                                                                                                             
2.30%         44167         org.apache.lucene.queries.spans.SpanScorer#setFreqCurrentDoc()                                                                                                                        
2.26%         43357         org.apache.lucene.queries.spans.NearSpansOrdered#stretchToOrder()                                                                                                                     
2.10%         40401         org.apache.lucene.codecs.lucene99.Lucene99PostingsReader$EverythingEnum#advance()                                                                                                     
1.97%         37865         jdk.internal.foreign.AbstractMemorySegmentImpl#checkBounds()                                                                                                                          
1.72%         33062         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$20#ordValue()                                                                                                             
1.61%         30925         org.apache.lucene.queries.spans.TermSpans#nextStartPosition()                                                                                                                         
1.44%         27706         org.apache.lucene.queries.intervals.OrderedIntervalsSource$OrderedIntervalIterator#nextInterval()                                                                                                               
1.40%         26894         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$VaryingBPVReader#getLongValue()                                                                                           
1.40%         26835         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$25#nextOrd()                                                                                                              
1.33%         25573         org.apache.lucene.codecs.lucene99.Lucene99PostingsReader$BlockImpactsPostingsEnum#advance()                                                                                           
1.18%         22779         org.apache.lucene.util.packed.DirectReader$DirectPackedReader4#get()                                                                                                                  
1.15%         22022         org.apache.lucene.codecs.lucene99.Lucene99PostingsReader$EverythingEnum#skipPositions()                                                                                               
1.10%         21218         org.apache.lucene.search.ConjunctionDISI#doNext()                                                                                                                                     
1.09%         21032         org.apache.lucene.queries.spans.NearSpansOrdered#nextStartPosition()                                                                                                                  
1.07%         20582         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$4#longValue()                                                                                                             
1.03%         19772         java.lang.invoke.VarHandleLongs$FieldInstanceReadWrite#weakCompareAndSetRelease()                                                                                                     
0.98%         18791         org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl#readShort()                                                                                                         
0.97%         18647         org.apache.lucene.codecs.lucene90.Lucene90NormsProducer$3#longValue()                                                                                                                 
0.91%         17504         org.apache.lucene.search.TopScoreDocCollector$SimpleTopScoreDocCollector$1#collect()                                                                                                  
0.86%         16506         org.apache.lucene.search.similarities.BM25Similarity$BM25Scorer#score()                                                                                                               
0.82%         15742         org.apache.lucene.codecs.lucene99.ForUtil#expand8()                          
0.77%         14792         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$DenseNumericDocValues#nextDoc()  

CANDIDATE:

PROFILE SUMMARY from 1921099 events (total: 1M)                                                                                                                                                                                             
  tests.profile.mode=cpu                                                                                                                                                                                                                    
  tests.profile.count=30                                                                                                                                                                                                                    
  tests.profile.stacksize=1                                                                                                                                                                                                                 
  tests.profile.linenumbers=false                                                                                                                                                                                                           
PERCENT       CPU SAMPLES   STACK                                                                                                                                                                                                           
10.33%        198404        org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegmentNHLD()                                                                                                                           
6.29%         120771        org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()                                                                                                                                             
4.25%         81662         jdk.internal.foreign.MemorySessionImpl#checkValidStateRaw()                                                                                                                                                     
3.94%         75673         org.apache.lucene.util.packed.DirectReader$DirectPackedReader20#get()                                                                                                                                           
3.41%         65587         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$25#advance()                                                                                                                                        
3.28%         62964         org.apache.lucene.util.packed.DirectReader$DirectPackedReader12#get()                                                                                                                                           
3.01%         57880         org.apache.lucene.codecs.lucene99.Lucene99PostingsReader$EverythingEnum#nextPosition()                                                                                                                          
2.88%         55269         org.apache.lucene.util.packed.DirectMonotonicReader#get()                                                                                                                                                       
2.23%         42912         org.apache.lucene.queries.spans.SpanScorer#setFreqCurrentDoc()                                                                                                                                                  
2.23%         42912         org.apache.lucene.queries.spans.NearSpansOrdered#stretchToOrder()                                                                                                                                               
2.10%         40261         org.apache.lucene.codecs.lucene99.Lucene99PostingsReader$EverythingEnum#advance()                                                                                                                               
1.95%         37519         jdk.internal.foreign.AbstractMemorySegmentImpl#checkBounds()                                                                                                                                                    
1.66%         31838         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$20#ordValue()                                                                                                                                       
1.59%         30631         org.apache.lucene.queries.spans.TermSpans#nextStartPosition()                                                                                                                                                   
1.43%         27509         org.apache.lucene.queries.intervals.OrderedIntervalsSource$OrderedIntervalIterator#nextInterval()                                                                                                               
1.40%         26944         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$VaryingBPVReader#getLongValue()                                                                                                                     
1.35%         25993         org.apache.lucene.codecs.lucene99.Lucene99PostingsReader$BlockImpactsPostingsEnum#advance()                                                                                                                     
1.34%         25730         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$25#nextOrd()                                                                                                                                        
1.23%         23545         org.apache.lucene.util.packed.DirectReader$DirectPackedReader4#get()                                                                                                                                            
1.18%         22707         org.apache.lucene.codecs.lucene99.Lucene99PostingsReader$EverythingEnum#skipPositions()                                                                                                                         
1.17%         22386         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$4#longValue()                                                                                                                                       
1.08%         20767         org.apache.lucene.search.ConjunctionDISI#doNext()                                                                                                                                                               
1.04%         19986         org.apache.lucene.queries.spans.NearSpansOrdered#nextStartPosition()                                                                                                                                            
1.01%         19463         java.lang.invoke.VarHandleLongs$FieldInstanceReadWrite#weakCompareAndSetRelease()                                                                                                                               
1.00%         19172         org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl#readShort()                                                                                                                                   
0.97%         18581         org.apache.lucene.codecs.lucene90.Lucene90NormsProducer$3#longValue()                                                                                                                                           
0.87%         16701         org.apache.lucene.search.similarities.BM25Similarity$BM25Scorer#score()                                                                                                                                         
0.87%         16684         org.apache.lucene.search.TopScoreDocCollector$SimpleTopScoreDocCollector$1#collect()
0.83%         15930         org.apache.lucene.codecs.lucene99.ForUtil#expand8()                                                                                             
0.79%         15103         org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl#readInt() 

@jainankitk
Contributor Author

Memory profile:
BASELINE:

PROFILE SUMMARY from 78943 events (total: 98729M)                                                                     
  tests.profile.mode=heap                                  
  tests.profile.count=30                                   
  tests.profile.stacksize=1                                
  tests.profile.linenumbers=false                          
PERCENT       HEAP SAMPLES  STACK                          
12.66%        12502M        org.apache.lucene.facet.taxonomy.TaxonomyFacets#initializeValueCounters()                                                                                                                                       
10.17%        10044M        org.apache.lucene.codecs.lucene99.Lucene99PostingsReader$BlockDocsEnum#<init>()                                                                                                                                 
6.61%         6523M         org.apache.lucene.util.ArrayUtil#growExact()                                              
6.32%         6241M         org.apache.lucene.util.FixedBitSet#<init>()                                               
4.79%         4726M         org.apache.lucene.util.ArrayUtil#growNoCopy()                                             
4.58%         4524M         org.apache.lucene.facet.FacetsConfig#stringToPath()                                       
4.38%         4326M         org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegmentNHLD()                                                                                                                           
3.93%         3882M         java.lang.StringUTF16#compress()                                                          
3.13%         3085M         org.apache.lucene.util.BytesRef#utf8ToString()                                            
2.67%         2637M         org.apache.lucene.codecs.lucene99.ForUtil#<init>()                                        
2.53%         2499M         org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#initializeCounts()                                                                                                                              
2.35%         2319M         java.util.ArrayList#grow()                                                                
2.32%         2292M         org.apache.lucene.search.MaxScoreAccumulator#get()                                        
1.95%         1926M         org.apache.lucene.search.ExactPhraseMatcher$1$1#getImpacts()                                                                                                                                                    
1.76%         1740M         java.util.AbstractList#iterator()                                                         
1.71%         1685M         org.apache.lucene.util.DocIdSetBuilder$Buffer#<init>()                                    
1.70%         1676M         org.apache.lucene.codecs.lucene90.blocktree.SegmentTermsEnumFrame#<init>()                                                                                                                                      
1.52%         1497M         org.apache.lucene.util.BytesRef#<init>()                                                  
1.36%         1338M         org.apache.lucene.util.fst.ByteSequenceOutputs#read()                                     
1.18%         1168M         java.util.ArrayList#iterator()                                                            
0.97%         961M          org.apache.lucene.search.ExactPhraseMatcher$1#getImpacts()                                                                                                                                                      
0.90%         884M          java.lang.Long#valueOf()                                                                  
0.81%         798M          jdk.internal.misc.Unsafe#allocateUninitializedArray()                                     
0.76%         752M          java.util.ArrayList#toArray()                                                             
0.73%         719M          jdk.internal.foreign.MappedMemorySegmentImpl#dup()                                        
0.72%         714M          java.util.concurrent.locks.AbstractQueuedSynchronizer#acquire()                                                                                                                                                 
0.71%         696M          java.util.Arrays#asList()                                                                 
0.63%         619M          java.lang.reflect.Array#newInstance()                                                     
0.60%         593M          org.apache.lucene.store.MemorySegmentIndexInput#buildSlice()                                                                                                                                                    
0.60%         593M          org.apache.lucene.search.BooleanScorer#<init>() 

CANDIDATE:

PROFILE SUMMARY from 78928 events (total: 98582M)                                                                     
  tests.profile.mode=heap                                  
  tests.profile.count=30                                   
  tests.profile.stacksize=1                                
  tests.profile.linenumbers=false                          
PERCENT       HEAP SAMPLES  STACK                          
12.68%        12502M        org.apache.lucene.facet.taxonomy.TaxonomyFacets#initializeValueCounters()                                                                                                                                       
10.16%        10013M        org.apache.lucene.codecs.lucene99.Lucene99PostingsReader$BlockDocsEnum#<init>()                                                                                                                                 
6.32%         6231M         org.apache.lucene.util.FixedBitSet#<init>()                                               
6.24%         6155M         org.apache.lucene.util.ArrayUtil#growExact()                                              
4.97%         4896M         org.apache.lucene.facet.FacetsConfig#stringToPath()                                       
4.78%         4711M         org.apache.lucene.util.ArrayUtil#growNoCopy()                                             
4.40%         4336M         org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegmentNHLD()                                                                                                                           
3.74%         3686M         java.lang.StringUTF16#compress()                                                          
3.21%         3162M         org.apache.lucene.util.BytesRef#utf8ToString()                                            
2.70%         2665M         org.apache.lucene.codecs.lucene99.ForUtil#<init>()                                        
2.54%         2500M         org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#initializeCounts()                                                                                                                              
2.36%         2322M         org.apache.lucene.search.MaxScoreAccumulator#get()                                        
2.18%         2148M         java.util.ArrayList#grow()                                                                
1.98%         1951M         org.apache.lucene.search.ExactPhraseMatcher$1$1#getImpacts()                                                                                                                                                    
1.79%         1768M         org.apache.lucene.codecs.lucene90.blocktree.SegmentTermsEnumFrame#<init>()                                                                                                                                      
1.77%         1741M         org.apache.lucene.util.DocIdSetBuilder$Buffer#<init>()                                    
1.65%         1623M         java.util.AbstractList#iterator()                                                         
1.51%         1491M         org.apache.lucene.util.BytesRef#<init>()                                                  
1.41%         1388M         org.apache.lucene.util.fst.ByteSequenceOutputs#read()                                     
1.11%         1095M         java.util.ArrayList#iterator()                                                            
1.10%         1080M         jdk.internal.misc.Unsafe#allocateUninitializedArray()                                     
0.96%         943M          org.apache.lucene.search.ExactPhraseMatcher$1#getImpacts()                                                                                                                                                      
0.84%         832M          java.util.ArrayList#toArray()                                                             
0.83%         821M          java.lang.Long#valueOf()                                                                  
0.79%         777M          jdk.internal.foreign.MappedMemorySegmentImpl#dup()                                        
0.75%         743M          java.util.Arrays#asList()                                                                 
0.71%         702M          java.util.concurrent.locks.AbstractQueuedSynchronizer#acquire()                                                                                                                                                 
0.65%         639M          org.apache.lucene.store.MemorySegmentIndexInput#buildSlice()                                                                                                                                                    
0.60%         594M          java.lang.reflect.Array#newInstance()                                                     
0.60%         589M          org.apache.lucene.search.BooleanScorer#<init>() 

@jainankitk
Contributor Author

Overall, I am not seeing any performance regression in the benchmark or in the memory/CPU profiles. Please let me know if I am missing something.

@iverase
Contributor

iverase commented Jul 26, 2024

I am afraid those benchmarks hardly exercise the change you are proposing so they will not show anything.

@jainankitk
Contributor Author

I am afraid those benchmarks hardly exercise the change you are proposing so they will not show anything.

Do you have anything specific in mind?

@iverase
Contributor

iverase commented Jul 26, 2024

What do you think of my comment above? Would you agree that this change makes the matches method more expensive?

I think what you propose is an anti-pattern for the IntersectVisitor API. The point of having two methods is that computing relate is in general more expensive than computing matches, and therefore matches should never call relate.
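To make the cost argument concrete, here is a minimal sketch of the two checks side by side. It is not the Lucene source: the class name RangeCheckSketch, the fixed BYTES_PER_DIM, the pack helper, and the use of Arrays.compareUnsigned in place of Lucene's comparator are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Illustrative sketch only; not the actual PointRangeQuery code. */
final class RangeCheckSketch {
  enum Relation { CELL_OUTSIDE_QUERY, CELL_CROSSES_QUERY, CELL_INSIDE_QUERY }

  static final int BYTES_PER_DIM = 4; // assumed fixed width per dimension

  private final byte[] lowerPoint;
  private final byte[] upperPoint;
  private final int numDims;

  RangeCheckSketch(byte[] lowerPoint, byte[] upperPoint) {
    this.lowerPoint = lowerPoint;
    this.upperPoint = upperPoint;
    this.numDims = lowerPoint.length / BYTES_PER_DIM;
  }

  /** Hypothetical helper: packs non-negative ints as big-endian bytes. */
  static byte[] pack(int... values) {
    ByteBuffer buffer = ByteBuffer.allocate(values.length * BYTES_PER_DIM);
    for (int value : values) {
      buffer.putInt(value);
    }
    return buffer.array();
  }

  /** Unsigned comparison of one dimension, standing in for Lucene's comparator. */
  private static int compare(byte[] a, byte[] b, int offset) {
    int end = offset + BYTES_PER_DIM;
    return Arrays.compareUnsigned(a, offset, end, b, offset, end);
  }

  /** Per-point check: at most two comparisons per dimension. */
  boolean matches(byte[] packedValue) {
    for (int dim = 0; dim < numDims; dim++) {
      int offset = dim * BYTES_PER_DIM;
      if (compare(packedValue, lowerPoint, offset) < 0
          || compare(packedValue, upperPoint, offset) > 0) {
        return false;
      }
    }
    return true;
  }

  /** Per-cell check: the disjoint test plus the extra "crosses" test. */
  Relation relate(byte[] minPackedValue, byte[] maxPackedValue) {
    boolean crosses = false;
    for (int dim = 0; dim < numDims; dim++) {
      int offset = dim * BYTES_PER_DIM;
      if (compare(minPackedValue, upperPoint, offset) > 0
          || compare(maxPackedValue, lowerPoint, offset) < 0) {
        return Relation.CELL_OUTSIDE_QUERY;
      }
      // Up to two more comparisons per dimension that matches never needs.
      crosses |=
          compare(minPackedValue, lowerPoint, offset) < 0
              || compare(maxPackedValue, upperPoint, offset) > 0;
    }
    return crosses ? Relation.CELL_CROSSES_QUERY : Relation.CELL_INSIDE_QUERY;
  }
}
```

In this sketch, a point that is not rejected costs two comparisons per dimension in matches, while a cell that is not disjoint can cost up to four per dimension in relate.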

@jainankitk
Contributor Author

what do you think of my comment above, would you agree that this change makes the matches method more expensive?

Probably. I have tweaked the code to avoid the check below. Let me know if you still feel it is too expensive.

crosses |=
    comparator.compare(minPackedValue, offset, lowerPoint, offset) < 0
        || comparator.compare(maxPackedValue, offset, upperPoint, offset) > 0;

I think what you propose is an anti-pattern for the IntersectVisitor API. The point of having two methods is that computing relate is in general more expensive than computing matches, and therefore matches should never call relate.

IMHO, relate and matches can be the same method, given that we don't resolve the relation further when it intersects.


private Relation relate(byte[] minPackedValue, byte[] maxPackedValue) {
private Relation relateHelper(
byte[] minPackedValue, byte[] maxPackedValue, boolean needCrossOrInside) {
Contributor

I don't see the point of this indirection with boolean needCrossOrInside. It makes the code less readable and doesn't seem to remove a whole lot of redundant code. I'd rather we just add a comment that explains the extra check we do in relate().

I'm not deeply familiar with this part of Lucene, but it seems like the checks here can have some non-obvious overheads? What would be a good benchmark to surface them? Is it possible to extend an existing benchmark?
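For readers following along, the indirection under review can be sketched roughly as follows. Only the relateHelper signature appears in this thread, so the body is a guess at its shape; the class name RelateHelperSketch and the use of int[] instead of packed byte[] values are simplifications assumed for illustration.

```java
/** Illustrative sketch of a flag-controlled helper; not the code from the patch. */
final class RelateHelperSketch {
  enum Relation { CELL_OUTSIDE_QUERY, CELL_CROSSES_QUERY, CELL_INSIDE_QUERY }

  private final int[] lower; // inclusive lower bound per dimension
  private final int[] upper; // inclusive upper bound per dimension

  RelateHelperSketch(int[] lower, int[] upper) {
    this.lower = lower;
    this.upper = upper;
  }

  /**
   * Shared loop. When needCrossOrInside is false the extra check is skipped,
   * so the result only distinguishes "outside" from "not outside".
   */
  private Relation relateHelper(int[] min, int[] max, boolean needCrossOrInside) {
    boolean crosses = false;
    for (int dim = 0; dim < lower.length; dim++) {
      if (min[dim] > upper[dim] || max[dim] < lower[dim]) {
        return Relation.CELL_OUTSIDE_QUERY;
      }
      if (needCrossOrInside) {
        crosses |= min[dim] < lower[dim] || max[dim] > upper[dim];
      }
    }
    return crosses ? Relation.CELL_CROSSES_QUERY : Relation.CELL_INSIDE_QUERY;
  }

  /** A point is treated as a degenerate cell: the same value is passed twice. */
  boolean matches(int[] value) {
    return relateHelper(value, value, false) != Relation.CELL_OUTSIDE_QUERY;
  }

  Relation relate(int[] min, int[] max) {
    return relateHelper(min, max, true);
  }
}
```

The sketch removes one duplicated loop, at the price of a branch on the flag in every iteration and a return value whose meaning depends on how the helper was called.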

Contributor Author

I am not completely convinced about the readability part, since redundancy requires changing code in multiple places. Maybe there is a more readable version of this code that I am unable to think of. But I do see your point about non-obvious overheads. Initially, I assumed that method inlining would mitigate this overhead, but the default size limit for inlining is fairly small: 35 bytes of bytecode.

% java -XX:+PrintFlagsFinal -version | grep MaxInlineSize
     intx C1MaxInlineSize                          = 35                                     {C1 product} {default}
     intx MaxInlineSize                            = 35                                     {C2 product} {default}
openjdk version "21.0.4" 2024-07-16 LTS
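The same values can also be read from inside a running JVM, which may be handy when a benchmark runs under different options than the command line above. This is a minimal sketch, assuming a HotSpot-based JDK; the class name InlineLimits is made up. It also prints FreqInlineSize, a separate HotSpot limit that applies to frequently executed call sites, so MaxInlineSize is not the only threshold involved.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

/** Illustrative sketch; works only on HotSpot-based JVMs. */
final class InlineLimits {
  /** Returns the current value of a HotSpot VM flag such as "MaxInlineSize". */
  static String vmFlag(String name) {
    HotSpotDiagnosticMXBean bean =
        ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
    // Throws IllegalArgumentException if this JVM does not define the flag.
    return bean.getVMOption(name).getValue();
  }

  public static void main(String[] args) {
    for (String flag : new String[] {"MaxInlineSize", "FreqInlineSize"}) {
      System.out.println(flag + " = " + vmFlag(flag));
    }
  }
}
```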

Signed-off-by: Ankit Jain <[email protected]>
@jainankitk jainankitk reopened this Aug 6, 2024
Signed-off-by: Ankit Jain <[email protected]>
@gsmiller
Contributor

gsmiller commented Aug 6, 2024

In general, I really appreciate that you're looking for opportunities to clean up the codebase and find ways to avoid duplicated logic. Thanks @jainankitk! At the same time, I don't personally agree with making this change. I do like that you've found a way to avoid the unnecessary work (a nice improvement over the original approach, which fully delegated to #relate), but I'm not sure this change actually solves a meaningful problem, and I think it's a little clunky for the reader at the same time (it's a bit strange to have to pass the same packed value point twice, trying to make the abstraction for relating two ranges work with a point). If this logic were used in a lot of places and were a bit more involved, I might be convinced. But as it is, I'm not personally in favor of moving forward with this tweak. That's my opinion at least.

@jainankitk
Contributor Author

I think it's a little clunky for the reader at the same time (it's a bit strange to have to pass the same packed value point twice

Thanks @gsmiller for providing this feedback. It also seemed a bit odd to me to pass the same packed value point twice, but I still wanted to get an opinion on it. After removing the clunky parts, a very small change remains that probably makes sense. I would have liked to avoid making the change in two places, but there seems to be no good way.

Just for context, I came across this code path while reviewing the changes for ApproximateRangeQuery in OpenSearch. I am wondering if we can make some other changes in PointRangeQuery to avoid duplicating a lot of code for ApproximateRangeQuery. For example, we could turn the anonymous ConstantScoreWeight class in the createWeight method into a named one, allowing ApproximateRangeQuery to override only the specific methods it needs.
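The refactoring idea can be sketched with stand-in classes. None of the types below are Lucene or OpenSearch classes; WeightSketch, RangeQuerySketch, ApproximateRangeQuerySketch, and the capped count are invented purely to show the pattern, and say nothing about what ApproximateRangeQuery actually does.

```java
/** Stand-in for a weight; not Lucene's ConstantScoreWeight. */
abstract class WeightSketch {
  abstract boolean matches(int value);

  /** Counts matching values; subclasses may override just this method. */
  int count(int[] values) {
    int hits = 0;
    for (int value : values) {
      if (matches(value)) {
        hits++;
      }
    }
    return hits;
  }
}

class RangeQuerySketch {
  final int lower;
  final int upper;

  RangeQuerySketch(int lower, int upper) {
    this.lower = lower;
    this.upper = upper;
  }

  /** Named rather than anonymous, so subclasses of the query can extend it. */
  protected class RangeWeight extends WeightSketch {
    @Override
    boolean matches(int value) {
      return value >= lower && value <= upper;
    }
  }

  WeightSketch createWeight() {
    return new RangeWeight();
  }
}

class ApproximateRangeQuerySketch extends RangeQuerySketch {
  final int maxHits; // hypothetical cap, for illustration only

  ApproximateRangeQuerySketch(int lower, int upper, int maxHits) {
    super(lower, upper);
    this.maxHits = maxHits;
  }

  @Override
  WeightSketch createWeight() {
    // Reuses matches() from RangeWeight and overrides only count().
    return new RangeWeight() {
      @Override
      int count(int[] values) {
        return Math.min(super.count(values), maxHits);
      }
    };
  }
}
```

With an anonymous class inside createWeight, the subclass would instead have to copy the whole weight implementation in order to change one method.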

@gsmiller
Contributor

gsmiller commented Aug 9, 2024

@jainankitk thanks for the iterations! I'm fine with making this change as you currently have it. I'll get it merged. Thanks!

Development

Successfully merging this pull request may close these issues.

Remove redundant code in PointRangeQuery Weight

5 participants