/ Changes On Branch in-scan-vs-index
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Changes In Branch in-scan-vs-index Excluding Merge-Ins

This is equivalent to a diff from 43f7ddad80 to dc4172e6b8

2020-09-30
18:06
Improved query optimization for multi-column indexes where the second or later columns are constrained by an IN operator and the earlier index columns limit the search to a small number of rows. Use the new OP_SeekScan opcode which does scanning of the relevant range of the index but gives up and falls back to doing a seek if the number of rows scanned grows to large, in order to guard against pathological cases where the estimated number of rows to be scanned is far too small. (check-in: 4a43430fd2 user: drh tags: trunk)
18:03
For the OP_SeekScan opcode, adjust the number of steps run before giving up based on the estimated number of comparisons needed to perform a seek. (Closed-Leaf check-in: dc4172e6b8 user: drh tags: in-scan-vs-index)
09:17
Better understanding of savepoint code (check-in: ce6d8d6215 user: shearer tags: trunk)
00:48
Add an sqlite3FaultSim() call to btreeNext() to make it easier to simulate I/O errors in calls to sqlite3BtreeNext(), and in OP_SeekScan. (check-in: 29cca775d3 user: drh tags: in-scan-vs-index)
2020-09-28
19:51
Revisiting the IN-scan optimization to try to fix it for the corner case where the statistics deceive the query planner into using a scan when an indexed lookup would be better. This check-in changes the code generation to do the IN-scan using a new OP_SeekScan opcode. That new opcode is designed to abandon the scan and fall back to a seek if it doesn't find a match quickly enough. For this work-in-progress check-in, OP_SeekScan is still a no-op and OP_SeekGE still ends up doing all the work. (check-in: d720b6981e user: drh tags: in-scan-vs-index)
15:49
Small performance improvement and size reduction by reducing the size of the WhereTerm object. (check-in: 43f7ddad80 user: drh tags: trunk)
13:34
Avoid the possibility of integer overflow on the --pagecache option to the CLI. See [forum:10a2892377|forum post 10a2892377] (check-in: d3d13df31a user: drh tags: trunk)

Changes to src/btree.c.

  5746   5746         pCur->eState = CURSOR_VALID;
  5747   5747         if( pCur->skipNext>0 ) return SQLITE_OK;
  5748   5748       }
  5749   5749     }
  5750   5750   
  5751   5751     pPage = pCur->pPage;
  5752   5752     idx = ++pCur->ix;
  5753         -  if( !pPage->isInit ){
         5753  +  if( !pPage->isInit || sqlite3FaultSim(412) ){
  5754   5754       /* The only known way for this to happen is for there to be a
  5755   5755       ** recursive SQL function that does a DELETE operation as part of a
  5756   5756       ** SELECT which deletes content out from under an active cursor
  5757   5757       ** in a corrupt database file where the table being DELETE-ed from
  5758   5758       ** has pages in common with the table being queried.  See TH3
  5759   5759       ** module cov1/btree78.test testcase 220 (2018-06-08) for an
  5760   5760       ** example. */

Changes to src/vdbe.c.

  4378   4378       goto jump_to_p2;
  4379   4379     }else if( eqOnly ){
  4380   4380       assert( pOp[1].opcode==OP_IdxLT || pOp[1].opcode==OP_IdxGT );
  4381   4381       pOp++; /* Skip the OP_IdxLt or OP_IdxGT that follows */
  4382   4382     }
  4383   4383     break;
  4384   4384   }
         4385  +
         4386  +
         4387  +/* Opcode: SeekScan  P1 * * * *
         4388  +** Synopsis: Scan-ahead up to P1 rows
         4389  +**
         4390  +** This opcode is a prefix opcode to OP_SeekGE.  In other words, this
         4391  +** opcode must be immediately followed by OP_SeekGE.  Furthermore, the
         4392  +** OP_SeekGE must be followed by OP_IdxGT.  These constraints are
         4393  +** checked by assert() statements.
         4394  +**
         4395  +** This opcode uses the P1 through P4 operands of the subsequent
         4396  +** OP_SeekGE.  In the text that follows, the operands of the subsequent
         4397  +** OP_SeekGE opcode are denoted as SeekOP.P1 through SeekOP.P4.   Only
         4398  +** the P1 operand of this opcode is used, and it is denoted as This.P1.
         4399  +**
         4400  +** This opcode helps to optimize IN operators on a multi-column index
         4401  +** where the IN operator is on the later terms of the index by avoiding
         4402  +** unnecessary seeks on the btree, substituting steps to the next row
         4403  +** of the b-tree instead.  A correct answer is obtained if this opcode
         4404  +** is omitted or is a no-op.
         4405  +**
         4406  +** The SeekGE.P3 and SeekGE.P4 operands identify an unpacked key which
         4407  +** is the desired entry that we want the cursor SeekGE.P1 to be pointing
         4408  +** to.  Call this SeekGE.P4/P5 row the "target".
         4409  +**
         4410  +** If the SeekGE.P1 cursor is not currently pointing to a valid row,
         4411  +** then this opcode is a no-op and control passes through into the OP_SeekGE.
         4412  +**
         4413  +** If the SeekGE.P1 cursor is pointing to a valid row, then that row
         4414  +** might be the target row, or it might be near and slightly before the
         4415  +** target row.  This opcode attempts to position the cursor on the target
         4416  +** row by, perhaps stepping by invoking sqlite3BtreeStep() on the cursor
         4417  +** between 0 and This.P1 times.
         4418  +**
         4419  +** There are three possible outcomes from this opcode:<ol>
         4420  +**
         4421  +** <li> If after This.P1 steps, the cursor is still point to a place that
         4422  +**      is earlier in the btree than the target row,
         4423  +**      then fall through into the subsquence OP_SeekGE opcode.
         4424  +**
         4425  +** <li> If the cursor is successfully moved to the target row by 0 or more
         4426  +**      sqlite3BtreeNext() calls, then jump to the first instruction after the
         4427  +**      OP_IdxGT opcode - or in other words, skip the next two opcodes.
         4428  +**
         4429  +** <li> If the cursor ends up past the target row (indicating the the target
         4430  +**      row does not exist in the btree) then jump to SeekOP.P2. 
         4431  +** </ol>
         4432  +*/
         4433  +case OP_SeekScan: {
         4434  +  VdbeCursor *pC;
         4435  +  int res;
         4436  +  int n;
         4437  +  UnpackedRecord r;
         4438  +
         4439  +  assert( pOp[1].opcode==OP_SeekGE );
         4440  +  assert( pOp[2].opcode==OP_IdxGT );
         4441  +  assert( pOp[1].p1==pOp[2].p1 );
         4442  +  assert( pOp[1].p2==pOp[2].p2 );
         4443  +  assert( pOp[1].p3==pOp[2].p3 );
         4444  +  assert( pOp[1].p4.i==pOp[2].p4.i );
         4445  +  assert( pOp->p1>0 );
         4446  +  pC = p->apCsr[pOp[1].p1];
         4447  +  assert( pC!=0 );
         4448  +  assert( pC->eCurType==CURTYPE_BTREE );
         4449  +  assert( !pC->isTable );
         4450  +  if( !sqlite3BtreeCursorIsValidNN(pC->uc.pCursor) ){
         4451  +#ifdef SQLITE_DEBUG
         4452  +     if( db->flags&SQLITE_VdbeTrace ){
         4453  +       printf("... cursor not valid - fall through\n");
         4454  +     }        
         4455  +#endif
         4456  +    break;
         4457  +  }
         4458  +  n = pOp->p1;
         4459  +  assert( n>=1 );
         4460  +  r.pKeyInfo = pC->pKeyInfo;
         4461  +  r.nField = (u16)pOp[1].p4.i;
         4462  +  r.default_rc = 0;
         4463  +  r.aMem = &aMem[pOp[1].p3];
         4464  +#ifdef SQLITE_DEBUG
         4465  +  {
         4466  +    int i;
         4467  +    for(i=0; i<r.nField; i++){
         4468  +      assert( memIsValid(&r.aMem[i]) );
         4469  +      REGISTER_TRACE(pOp[1].p3+i, &aMem[pOp[1].p3+i]);
         4470  +    }
         4471  +  }
         4472  +#endif
         4473  +  res = 0;  /* Not needed.  Only used to silence a warning. */
         4474  +  while(1){
         4475  +    rc = sqlite3VdbeIdxKeyCompare(db, pC, &r, &res);
         4476  +    if( rc ) goto abort_due_to_error;
         4477  +    if( res>0 ){
         4478  +      seekscan_search_fail:
         4479  +#ifdef SQLITE_DEBUG
         4480  +      if( db->flags&SQLITE_VdbeTrace ){
         4481  +        printf("... %d steps and then skip\n", pOp->p1 - n);
         4482  +      }        
         4483  +#endif
         4484  +      VdbeBranchTaken(1,3);
         4485  +      pOp++;
         4486  +      goto jump_to_p2;
         4487  +    }
         4488  +    if( res==0 ){
         4489  +#ifdef SQLITE_DEBUG
         4490  +      if( db->flags&SQLITE_VdbeTrace ){
         4491  +        printf("... %d steps and then success\n", pOp->p1 - n);
         4492  +      }        
         4493  +#endif
         4494  +      VdbeBranchTaken(2,3);
         4495  +      pOp += 2;
         4496  +      break;
         4497  +    }
         4498  +    if( n<=0 ){
         4499  +#ifdef SQLITE_DEBUG
         4500  +      if( db->flags&SQLITE_VdbeTrace ){
         4501  +        printf("... fall through after %d steps\n", pOp->p1);
         4502  +      }        
         4503  +#endif
         4504  +      VdbeBranchTaken(0,3);
         4505  +      break;
         4506  +    }
         4507  +    n--;
         4508  +    rc = sqlite3BtreeNext(pC->uc.pCursor, 0);
         4509  +    if( rc ){
         4510  +      if( rc==SQLITE_DONE ){
         4511  +        rc = SQLITE_OK;
         4512  +        goto seekscan_search_fail;
         4513  +      }else{
         4514  +        goto abort_due_to_error;
         4515  +      }
         4516  +    }
         4517  +  }
         4518  +  
         4519  +  break;
         4520  +}
         4521  +
  4385   4522   
  4386   4523   /* Opcode: SeekHit P1 P2 P3 * *
  4387   4524   ** Synopsis: set P2<=seekHit<=P3
  4388   4525   **
  4389   4526   ** Increase or decrease the seekHit value for cursor P1, if necessary,
  4390   4527   ** so that it is no less than P2 and no greater than P3.
  4391   4528   **
................................................................................
  5910   6047       int i;
  5911   6048       for(i=0; i<r.nField; i++){
  5912   6049         assert( memIsValid(&r.aMem[i]) );
  5913   6050         REGISTER_TRACE(pOp->p3+i, &aMem[pOp->p3+i]);
  5914   6051       }
  5915   6052     }
  5916   6053   #endif
  5917         -  res = 0;  /* Not needed.  Only used to silence a warning. */
  5918         -  rc = sqlite3VdbeIdxKeyCompare(db, pC, &r, &res);
         6054  +
         6055  +  /* Inlined version of sqlite3VdbeIdxKeyCompare() */
         6056  +  {
         6057  +    i64 nCellKey = 0;
         6058  +    BtCursor *pCur;
         6059  +    Mem m;
         6060  +
         6061  +    assert( pC->eCurType==CURTYPE_BTREE );
         6062  +    pCur = pC->uc.pCursor;
         6063  +    assert( sqlite3BtreeCursorIsValid(pCur) );
         6064  +    nCellKey = sqlite3BtreePayloadSize(pCur);
         6065  +    /* nCellKey will always be between 0 and 0xffffffff because of the way
         6066  +    ** that btreeParseCellPtr() and sqlite3GetVarint32() are implemented */
         6067  +    if( nCellKey<=0 || nCellKey>0x7fffffff ){
         6068  +      rc = SQLITE_CORRUPT_BKPT;
         6069  +      goto abort_due_to_error;
         6070  +    }
         6071  +    sqlite3VdbeMemInit(&m, db, 0);
         6072  +    rc = sqlite3VdbeMemFromBtreeZeroOffset(pCur, (u32)nCellKey, &m);
         6073  +    if( rc ) goto abort_due_to_error;
         6074  +    res = sqlite3VdbeRecordCompareWithSkip(m.n, m.z, &r, 0);
         6075  +    sqlite3VdbeMemRelease(&m);
         6076  +  }
         6077  +  /* End of inlined sqlite3VdbeIdxKeyCompare() */
         6078  +
  5919   6079     assert( (OP_IdxLE&1)==(OP_IdxLT&1) && (OP_IdxGE&1)==(OP_IdxGT&1) );
  5920   6080     if( (pOp->opcode&1)==(OP_IdxLT&1) ){
  5921   6081       assert( pOp->opcode==OP_IdxLE || pOp->opcode==OP_IdxLT );
  5922   6082       res = -res;
  5923   6083     }else{
  5924   6084       assert( pOp->opcode==OP_IdxGE || pOp->opcode==OP_IdxGT );
  5925   6085       res++;
  5926   6086     }
  5927   6087     VdbeBranchTaken(res>0,2);
  5928         -  if( rc ) goto abort_due_to_error;
         6088  +  assert( rc==SQLITE_OK );
  5929   6089     if( res>0 ) goto jump_to_p2;
  5930   6090     break;
  5931   6091   }
  5932   6092   
  5933   6093   /* Opcode: Destroy P1 P2 P3 * *
  5934   6094   **
  5935   6095   ** Delete an entire database table or index whose root page in the database

Changes to src/where.c.

  2560   2560           M = pProbe->aiRowLogEst[saved_nEq];
  2561   2561           logK = estLog(nIn);
  2562   2562           safetyMargin = 10;  /* TUNING: extra weight for indexed IN */
  2563   2563           if( M + logK + safetyMargin < nIn + rLogSize ){
  2564   2564             WHERETRACE(0x40,
  2565   2565               ("Scan preferred over IN operator on column %d of \"%s\" (%d<%d)\n",
  2566   2566                saved_nEq, pProbe->zName, M+logK+10, nIn+rLogSize));
  2567         -          continue;
         2567  +          pNew->wsFlags |= WHERE_IN_SEEKSCAN;
  2568   2568           }else{
  2569   2569             WHERETRACE(0x40,
  2570   2570               ("IN operator preferred on column %d of \"%s\" (%d>=%d)\n",
  2571   2571                saved_nEq, pProbe->zName, M+logK+10, nIn+rLogSize));
  2572   2572           }
  2573   2573         }
  2574   2574         pNew->wsFlags |= WHERE_COLUMN_IN;
................................................................................
  5193   5193         assert( iIndexCur>=0 );
  5194   5194         if( op ){
  5195   5195           sqlite3VdbeAddOp3(v, op, iIndexCur, pIx->tnum, iDb);
  5196   5196           sqlite3VdbeSetP4KeyInfo(pParse, pIx);
  5197   5197           if( (pLoop->wsFlags & WHERE_CONSTRAINT)!=0
  5198   5198            && (pLoop->wsFlags & (WHERE_COLUMN_RANGE|WHERE_SKIPSCAN))==0
  5199   5199            && (pLoop->wsFlags & WHERE_BIGNULL_SORT)==0
         5200  +         && (pLoop->wsFlags & WHERE_IN_SEEKSCAN)==0
  5200   5201            && (pWInfo->wctrlFlags&WHERE_ORDERBY_MIN)==0
  5201   5202            && pWInfo->eDistinct!=WHERE_DISTINCT_ORDERED
  5202   5203           ){
  5203   5204             sqlite3VdbeChangeP5(v, OPFLAG_SEEKEQ);
  5204   5205           }
  5205   5206           VdbeComment((v, "%s", pIx->zName));
  5206   5207   #ifdef SQLITE_ENABLE_COLUMN_USED_MASK
................................................................................
  5355   5356         struct InLoop *pIn;
  5356   5357         int j;
  5357   5358         sqlite3VdbeResolveLabel(v, pLevel->addrNxt);
  5358   5359         for(j=pLevel->u.in.nIn, pIn=&pLevel->u.in.aInLoop[j-1]; j>0; j--, pIn--){
  5359   5360           sqlite3VdbeJumpHere(v, pIn->addrInTop+1);
  5360   5361           if( pIn->eEndLoopOp!=OP_Noop ){
  5361   5362             if( pIn->nPrefix ){
  5362         -            assert( pLoop->wsFlags & WHERE_IN_EARLYOUT );
         5363  +            int bEarlyOut = 
         5364  +                (pLoop->wsFlags & WHERE_VIRTUALTABLE)==0
         5365  +                 && (pLoop->wsFlags & WHERE_IN_EARLYOUT)!=0;
  5363   5366               if( pLevel->iLeftJoin ){
  5364   5367                 /* For LEFT JOIN queries, cursor pIn->iCur may not have been
  5365   5368                 ** opened yet. This occurs for WHERE clauses such as
  5366   5369                 ** "a = ? AND b IN (...)", where the index is on (a, b). If
  5367   5370                 ** the RHS of the (a=?) is NULL, then the "b IN (...)" may
  5368   5371                 ** never have been coded, but the body of the loop run to
  5369   5372                 ** return the null-row. So, if the cursor is not open yet,
  5370   5373                 ** jump over the OP_Next or OP_Prev instruction about to
  5371   5374                 ** be coded.  */
  5372   5375                 sqlite3VdbeAddOp2(v, OP_IfNotOpen, pIn->iCur, 
  5373         -                  sqlite3VdbeCurrentAddr(v) + 2 + 
  5374         -                     ((pLoop->wsFlags & WHERE_VIRTUALTABLE)==0)
  5375         -              );
         5376  +                  sqlite3VdbeCurrentAddr(v) + 2 + bEarlyOut);
  5376   5377                 VdbeCoverage(v);
  5377   5378               }
  5378         -            if( (pLoop->wsFlags & WHERE_VIRTUALTABLE)==0 ){
         5379  +            if( bEarlyOut ){
  5379   5380                 sqlite3VdbeAddOp4Int(v, OP_IfNoHope, pLevel->iIdxCur,
  5380   5381                     sqlite3VdbeCurrentAddr(v)+2,
  5381   5382                     pIn->iBase, pIn->nPrefix);
  5382   5383                 VdbeCoverage(v);
  5383   5384               }
  5384   5385             }
  5385   5386             sqlite3VdbeAddOp2(v, pIn->eEndLoopOp, pIn->iCur, pIn->addrInTop);

Changes to src/whereInt.h.

   615    615   #define WHERE_MULTI_OR     0x00002000  /* OR using multiple indices */
   616    616   #define WHERE_AUTO_INDEX   0x00004000  /* Uses an ephemeral index */
   617    617   #define WHERE_SKIPSCAN     0x00008000  /* Uses the skip-scan algorithm */
   618    618   #define WHERE_UNQ_WANTED   0x00010000  /* WHERE_ONEROW would have been helpful*/
   619    619   #define WHERE_PARTIALIDX   0x00020000  /* The automatic index is partial */
   620    620   #define WHERE_IN_EARLYOUT  0x00040000  /* Perhaps quit IN loops early */
   621    621   #define WHERE_BIGNULL_SORT 0x00080000  /* Column nEq of index is BIGNULL */
          622  +#define WHERE_IN_SEEKSCAN  0x00100000  /* Seek-scan optimization for IN */
   622    623   
   623    624   #endif /* !defined(SQLITE_WHEREINT_H) */

Changes to src/wherecode.c.

   566    566       VdbeCoverageIf(v, !bRev);
   567    567       assert( (pLoop->wsFlags & WHERE_MULTI_OR)==0 );
   568    568   
   569    569       pLoop->wsFlags |= WHERE_IN_ABLE;
   570    570       if( pLevel->u.in.nIn==0 ){
   571    571         pLevel->addrNxt = sqlite3VdbeMakeLabel(pParse);
   572    572       }
   573         -    if( iEq>0 ){
          573  +    if( iEq>0 && (pLoop->wsFlags & WHERE_IN_SEEKSCAN)==0 ){
   574    574         pLoop->wsFlags |= WHERE_IN_EARLYOUT;
   575    575       }
   576    576   
   577    577       i = pLevel->u.in.nIn;
   578    578       pLevel->u.in.nIn += nEq;
   579    579       pLevel->u.in.aInLoop =
   580    580          sqlite3DbReallocOrFree(pParse->db, pLevel->u.in.aInLoop,
................................................................................
   604    604               }
   605    605             }else{
   606    606               pIn->eEndLoopOp = OP_Noop;
   607    607             }
   608    608             pIn++;
   609    609           }
   610    610         }
   611         -      if( iEq>0 ){
          611  +      if( iEq>0 && (pLoop->wsFlags & WHERE_IN_SEEKSCAN)==0 ){
   612    612           sqlite3VdbeAddOp3(v, OP_SeekHit, pLevel->iIdxCur, 0, iEq);
   613    613         }
   614    614       }else{
   615    615         pLevel->u.in.nIn = 0;
   616    616       }
   617    617       sqlite3DbFree(pParse->db, aiMap);
   618    618   #endif
................................................................................
  1800   1800         if( regBignull ){
  1801   1801           sqlite3VdbeAddOp2(v, OP_Integer, 1, regBignull);
  1802   1802           VdbeComment((v, "NULL-scan pass ctr"));
  1803   1803         }
  1804   1804   
  1805   1805         op = aStartOp[(start_constraints<<2) + (startEq<<1) + bRev];
  1806   1806         assert( op!=0 );
         1807  +      if( (pLoop->wsFlags & WHERE_IN_SEEKSCAN)!=0 ){
         1808  +        assert( op==OP_SeekGE );
         1809  +        assert( regBignull==0 );
         1810  +        /* TUNING:  The OP_SeekScan opcode seeks to reduce the number
         1811  +        ** of expensive seek operations by replacing a single seek with
         1812  +        ** 1 or more step operations.  The question is, how many steps
         1813  +        ** should we try before giving up and going with a seek.  The cost
         1814  +        ** of a seek is proportional to the logarithm of the of the number
         1815  +        ** of entries in the tree, so basing the number of steps to try
         1816  +        ** on the estimated number of rows in the btree seems like a good
         1817  +        ** guess. */
         1818  +        sqlite3VdbeAddOp1(v, OP_SeekScan, (pIdx->aiRowLogEst[0]+9)/10);
         1819  +        VdbeCoverage(v);
         1820  +      }
  1807   1821         sqlite3VdbeAddOp4Int(v, op, iIdxCur, addrNxt, regBase, nConstraint);
  1808   1822         VdbeCoverage(v);
  1809   1823         VdbeCoverageIf(v, op==OP_Rewind);  testcase( op==OP_Rewind );
  1810   1824         VdbeCoverageIf(v, op==OP_Last);    testcase( op==OP_Last );
  1811   1825         VdbeCoverageIf(v, op==OP_SeekGT);  testcase( op==OP_SeekGT );
  1812   1826         VdbeCoverageIf(v, op==OP_SeekGE);  testcase( op==OP_SeekGE );
  1813   1827         VdbeCoverageIf(v, op==OP_SeekLE);  testcase( op==OP_SeekLE );
................................................................................
  1902   1916                              nConstraint+bSeekPastNull);
  1903   1917         testcase( op==OP_IdxGT );  VdbeCoverageIf(v, op==OP_IdxGT );
  1904   1918         testcase( op==OP_IdxGE );  VdbeCoverageIf(v, op==OP_IdxGE );
  1905   1919         testcase( op==OP_IdxLT );  VdbeCoverageIf(v, op==OP_IdxLT );
  1906   1920         testcase( op==OP_IdxLE );  VdbeCoverageIf(v, op==OP_IdxLE );
  1907   1921       }
  1908   1922   
  1909         -    if( pLoop->wsFlags & WHERE_IN_EARLYOUT ){
         1923  +    if( (pLoop->wsFlags & WHERE_IN_EARLYOUT)!=0 ){
  1910   1924         sqlite3VdbeAddOp3(v, OP_SeekHit, iIdxCur, nEq, nEq);
  1911   1925       }
  1912   1926   
  1913   1927       /* Seek the table cursor, if required */
  1914   1928       omitTable = (pLoop->wsFlags & WHERE_IDX_ONLY)!=0 
  1915   1929              && (pWInfo->wctrlFlags & WHERE_OR_SUBCLAUSE)==0;
  1916   1930       if( omitTable ){