adding additional checks to improve index combatibility validation#18751
adding additional checks to improve index combatibility validation#18751RajaVinoth96 wants to merge 7 commits into
Conversation
| tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName(TABLE_NAME) | ||
| .setNoDictionaryColumns(List.of("myCol2")) | ||
| .setInvertedIndexColumns(List.of("myCol1")) | ||
| .setSortedColumn("myCol1") |
There was a problem hiding this comment.
Removed this because it is redundant to have an inverted index on a sorted column
| assertValid(tc); | ||
| } | ||
|
|
||
| @Test |
There was a problem hiding this comment.
Dictionary encoding on json index is in itself an anti pattern
|
|
||
| @Test | ||
| public void testTextIndexStringColumnDictPasses() { | ||
| FieldConfig fc = new FieldConfig(STR_COL, EncodingType.DICTIONARY, IndexType.TEXT, null, null); |
There was a problem hiding this comment.
Dictionary encoding on text index is an anti pattern
|
|
||
| @Test | ||
| public void testTextIndexNonStringColumnFails() { | ||
| FieldConfig fc = new FieldConfig(INT_COL, EncodingType.DICTIONARY, IndexType.TEXT, null, null); |
There was a problem hiding this comment.
Dictionary encoding on text index is an anti pattern
| public void testRawWithExplicitDictAndInvertedPlusFstPasses() { | ||
| ObjectNode indexes = JsonUtils.newObjectNode(); | ||
| indexes.set("dictionary", JsonUtils.newObjectNode()); | ||
| FieldConfig fc = new FieldConfig(STR_COL, EncodingType.RAW, null, List.of(IndexType.FST), |
There was a problem hiding this comment.
Having both FST and inverted enabled on the same column is redundant.
There was a problem hiding this comment.
Pull request overview
This PR strengthens table/index configuration validation in pinot-segment-local by adding additional index-combination compatibility checks (and adjusting/removing some tests accordingly), plus a number of formatting-only cleanups.
Changes:
- Add new “redundant/anti-pattern” index combination checks (e.g., sorted-column vs other indexes; vector/text/range/json/etc vs other indexes).
- Update and remove certain unit tests in
IndexCombinationValidationTest/TableConfigUtilsTestto reflect the new validation expectations. - Apply assorted formatting/Javadoc/whitespace changes across several index type implementations.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| pinot-segment-local/src/test/java/org/apache/pinot/segment/local/utils/TableConfigUtilsTest.java | Removes a sorted-column setting in one test case. |
| pinot-segment-local/src/test/java/org/apache/pinot/segment/local/utils/IndexCombinationValidationTest.java | Removes/adjusts some index-combination validation tests. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/TableConfigUtils.java | Adds new validation for “redundant indexes on sorted columns” + formatting updates. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/vector/VectorIndexType.java | Adds validation preventing vector index coexisting with several other indexes. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/text/TextIndexType.java | Adds validation preventing text index coexisting with several other indexes. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/range/RangeIndexType.java | Adds validation preventing range index coexisting with several other indexes. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/json/JsonIndexType.java | Adds validation for JSON index (including new multi-value restriction) + additional anti-pattern checks. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/inverted/InvertedIndexType.java | Adds validation preventing inverted index coexisting with several other indexes. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/h3/H3IndexType.java | Adds validation preventing H3 index coexisting with several other indexes + formatting changes. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/fst/IFSTIndexType.java | Adds validation preventing IFST index coexisting with several other indexes. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/fst/FstIndexType.java | Adds validation preventing FST index coexisting with several other indexes. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/dictionary/DictionaryIndexType.java | Adds validation preventing dictionary index coexisting with several other indexes. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/bloom/BloomIndexType.java | Adds validation preventing bloom filter coexisting with several other indexes. |
| .setSortedColumn("myCol1") | ||
| .build(); | ||
| try { | ||
| // Enable forward index disabled flag for a column with inverted index and is sorted |
| DataType storedType = fieldSpec.getDataType().getStoredType(); | ||
| Preconditions.checkState(storedType == DataType.STRING || storedType == DataType.MAP, | ||
| "Cannot create JSON index on column: %s of stored type other than STRING or MAP", column); | ||
| Preconditions.checkState(fieldSpec.isSingleValueField(), "Cannot create JSON index on multi-value column: %s", | ||
| column); |
| for (IndexType indexType : List.of( | ||
| StandardIndexes.vector(), | ||
| StandardIndexes.json(), | ||
| StandardIndexes.text(), | ||
| StandardIndexes.h3())) { |
| for (IndexType indexType : List.of( | ||
| StandardIndexes.bloomFilter(), | ||
| StandardIndexes.dictionary(), | ||
| StandardIndexes.inverted(), | ||
| StandardIndexes.vector(), | ||
| StandardIndexes.range(), | ||
| StandardIndexes.json(), | ||
| StandardIndexes.fst(), | ||
| StandardIndexes.h3(), | ||
| StandardIndexes.ifst())) { | ||
| Preconditions.checkState(indexConfigs.getConfig(indexType).isDisabled(), | ||
| "Anti pattern to enable both text index and %s on column: %s", | ||
| indexType.getPrettyName(), fieldSpec.getName()); | ||
| } |
| for (IndexType indexType : List.of( | ||
| StandardIndexes.bloomFilter(), | ||
| StandardIndexes.inverted(), | ||
| StandardIndexes.vector(), | ||
| StandardIndexes.json(), | ||
| StandardIndexes.text(), | ||
| StandardIndexes.fst(), | ||
| StandardIndexes.h3(), | ||
| StandardIndexes.ifst())) { |
| for (IndexType indexType : List.of(StandardIndexes.bloomFilter(), StandardIndexes.inverted(), | ||
| StandardIndexes.range())) { | ||
| Preconditions.checkState(indexConfigsForSortedColumn.getConfig(indexType).isDisabled(), | ||
| "Redundant to enable %s on a sorted column: %s", indexType.getPrettyName(), fieldSpec.getName()); | ||
| } |
| public IndexHandler createIndexHandler(SegmentDirectory segmentDirectory, Map<String, FieldIndexConfigs> configsByCol, | ||
| Schema schema, TableConfig tableConfig) { | ||
| Schema schema, TableConfig tableConfig) { | ||
| return new H3IndexHandler(segmentDirectory, configsByCol, tableConfig, schema); |
| protected H3IndexReader createIndexReader(PinotDataBuffer dataBuffer, ColumnMetadata metadata, | ||
| H3IndexConfig indexConfig) { | ||
| H3IndexConfig indexConfig) { | ||
| return new ImmutableH3IndexReader(dataBuffer); |
| for (IndexType indexType : List.of( | ||
| StandardIndexes.bloomFilter(), | ||
| StandardIndexes.vector(), | ||
| StandardIndexes.range(), | ||
| StandardIndexes.json(), | ||
| StandardIndexes.text(), | ||
| StandardIndexes.fst(), | ||
| StandardIndexes.h3(), | ||
| StandardIndexes.ifst())) { |
The PR adds additional check to enforce better index type compatibility checks.
Some of test cases have been removed or altered because the index configuration is either redundant or an anti pattern. Comments added to explain why it is the case.
Index_compatibility_matrix.xlsx