Let’s expose some metadata for our database.
There is a SQL standard to expose this based on a schema called INFORMATION_SCHEMA
. We are going to use a simpler syntax based on Snowflake. As we discussed before parsing is not our main topic, so here simpler is perfectly OK.
We are just going to show all existing tables, ergo we are going to add a command called SHOW TABLES
.
Our implementation is quite simple. Some would say trivial and naïve.
Engine
The code for the calm engine is available here.
After we change the top-level entry function to dispatch the SHOW TABLES
command, we implement is as follows:
def show_tables() -> SomeShowTablesResult:
table_names = [p.stem for p in DATABASE_PATH.glob("*.csv")]
return SomeShowTablesResult(table_names=table_names)
We just return the files names in the database directory without the CSV extension 😳.
Parsing
Let’s have a look at the calm parser. The code is available here.
We have a new “SQL” statement, so the starting point from the basic version:
SomeSQLStatement = SomeCreateTable | SomeInsertInto | SomeSelect
becomes:
SomeSQLStatement = SomeCreateTable | SomeInsertInto | SomeSelect | SomeShowTables
SHOW TABLES
is not standard SQL hence the parsing library we use doesn’t recognize it. That is fine…
Here is our new parse
function entry point:
def parse(statement_text: str) -> SomeSQLStatement:
p = sqlparse.parse(statement_text)
stmt = p[0]
statement_type_text = stmt.get_type()
if statement_type_text == "CREATE":
return parse_create_table(stmt)
elif ...
elif statement_type_text == "UNKNOWN":
# sqlparse doesn't know about SHOW TABLES
tokens = [token for token in stmt.tokens if not token.is_whitespace]
if (
len(tokens) == 2
and tokens[0].value.upper() == "SHOW"
and tokens[1].value.upper() == "TABLES"
):
return SomeShowTables()
else:
raise ValueError(f"Still not processing {statement_text}")
else:
raise ValueError(f"Still not processing {statement_type_text}")
Because the parser library doesn’t know about the keywords SHOW
and TABLES
, their type is UNKNOWN
. Fortunately the library that we use is OK with that.
We added a trivial Pydantic model to support this:
class SomeShowTables(SomeSQLStatementBase):
pass
Final words
This is easy but useful.
The next post will start going into the meat of things that will occupy us for a long time: WHERE
clauses.