Skip to content

Conversation

zaleslaw
Copy link
Collaborator

Enhanced containsForbiddenPatterns to use regex for stricter checks on SQL keywords and patterns, disallowing forbidden elements like comments and multiple statements. Updated isValidSqlQuery documentation for clarity. Added a test to validate reading from a table with column names containing reserved SQL keywords.

Fixes #1307

Enhanced `containsForbiddenPatterns` to use regex for stricter checks on SQL keywords and patterns, disallowing forbidden elements like comments and multiple statements. Updated `isValidSqlQuery` documentation for clarity. Added a test to validate reading from a table with column names containing reserved SQL keywords.
@zaleslaw zaleslaw requested review from Copilot and koperagen July 15, 2025 15:26
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

The PR strengthens SQL validation by switching to regex-based forbidden-pattern checks, clarifies the isValidSqlQuery documentation, and adds a test for reading from tables with columns named using reserved keywords.

  • Use regex patterns in containsForbiddenPatterns for stricter detection.
  • Minor documentation updates in isValidSqlQuery.
  • New H2 test for tables whose column names include reserved SQL keywords.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
dataframe-jdbc/src/test/kotlin/org/jetbrains/kotlinx/dataframe/io/h2/h2Test.kt Added a test verifying SELECT works when column names include reserved keywords
dataframe-jdbc/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/readJdbc.kt Switched forbiddenPatterns to regex, improved error logging, and updated Javadoc examples
Comments suppressed due to low confidence (2)

dataframe-jdbc/src/test/kotlin/org/jetbrains/kotlinx/dataframe/io/h2/h2Test.kt:796

  • [nitpick] Consider simplifying the test function name (e.g., read from table with column name containing reserved SQL keyword) for clarity and conciseness.
    fun `read from table with column name containing the reserved SQL keywords`() {

dataframe-jdbc/src/test/kotlin/org/jetbrains/kotlinx/dataframe/io/h2/h2Test.kt:814

  • Add tests for the newly documented forbidden patterns (e.g., queries containing --, /* */, or multiple statements separated by ;) to verify that containsForbiddenPatterns correctly rejects them.
    }


@Test
fun `read from table with column name containing the reserved SQL keywords`() {
@Language("SQL")
Copy link

Copilot AI Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] For the multi-line SQL string, consider appending trimIndent() (or trimMargin()) to remove leading whitespace and ensure the query is formatted as intended.

Copilot uses AI. Check for mistakes.

Comment on lines 430 to 434
)

// Use regular expressions to match reserved words only as separate entities
for (pattern in forbiddenPatterns) {
if (input.contains(pattern)) {
val regex = Regex(pattern, RegexOption.IGNORE_CASE)
Copy link

Copilot AI Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider precompiling the forbiddenPatterns into Regex instances once (e.g., at initialization) instead of compiling a new Regex on every call to improve performance.

Copilot uses AI. Check for mistakes.

@zaleslaw zaleslaw merged commit ecf48c5 into master Jul 15, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ForbiddenPatterns validation failed for SELECT last_update FROM sakila.staff

2 participants