schema_2_encoder

This module provides utilities for mapping SQLAlchemy type objects to simplified type representations suitable for LLM consumption. It handles both generic SQLAlchemy types (e.g., String, Integer) and SQL standard types (e.g., VARCHAR, BIGINT).

mcp_ohmy_sql.db.relational.schema_2_encoder.encode_column_info(table_info: TableInfo, column_info: ColumnInfo) str[source]

Encode a database column into LLM-friendly compact format.

Transforms verbose column metadata into a concise string representation optimized for Large Language Model consumption in text-to-SQL tasks.

Format: ${COLUMN_NAME}:${DATA_TYPE}${PRIMARY_KEY}${UNIQUE}${NOT_NULL}${INDEX}${FOREIGN_KEY}

Note

There might be multiple Foreign Keys encoded as *FK->Table1.Column1*FK->Table2.Column2.

Constraints are encoded as:

  • *PK: Primary Key (implies unique and indexed)

  • *UQ: Unique constraint (implies indexed)

  • *NN: Not Null constraint

  • *IDX: Has database index

  • *FK->Table.Column: Foreign key reference

Redundant constraints are automatically omitted (PK/UQ don’t show IDX).

Parameters:
  • table_info – Table metadata containing primary key information

  • column_info – Column metadata with type, constraints, and relationships

Returns:

Compact column representation string

Examples:

  • Primary key column: UserId:INT*PK

  • Foreign key with index: CategoryId:INT*NN*IDX*FK->Category.CategoryId

  • Unique email field: Email:STR*UQ*NN

  • Simple nullable column: Description:STR

mcp_ohmy_sql.db.relational.schema_2_encoder.encode_table_info(table_info: TableInfo) str[source]

Encode a database table into LLM-friendly compact format.

Format:

Table TableName(
    encoded_column_info_1,
    encoded_column_info_2,
    ...
)

This format provides:

  • Immediate visual structure similar to SQL CREATE TABLE

  • Compact representation reducing token usage by ~70%

  • Preserves all essential schema information

  • Self-documenting constraint annotations

  • Clear foreign key relationships

Parameters:

table_info – Table metadata containing columns and foreign keys

Returns:

Compact table representation string

Example:

Table Product(
    ProductId:INT*PK,
    ProductName:STR*NN,
    CategoryId:INT*NN*FK->Category.CategoryId,
    Price:DEC*NN,
    Stock:INT*NN,
    CreatedAt:TS*NN,
    UpdatedAt:TS
)

# or
View SalesReport(
    ...
)

# or
MaterializedView MonthlySales(
    ...
)
mcp_ohmy_sql.db.relational.schema_2_encoder.encode_schema_info(schema_info: SchemaInfo) str[source]

Encode a database schema into LLM-friendly compact format.

Format:

Schema SchemaName(
    encoded_table_info_1,
    encoded_table_info_2,
    ...,
)

Key benefits for LLM consumption:

  • Token Efficiency: Reduces schema representation by ~70% compared to verbose SQL DDL or JSON formats

  • Semantic Clarity: Constraint abbreviations (PK, FK, NN) are intuitive and consistently applied

  • Relationship Visibility: Foreign keys show target table/column inline, enabling quick relationship understanding

  • Type Simplification: Database-specific types mapped to universal categories (STR, INT, DEC, etc.)

  • Hierarchical Structure: Clear nesting shows schema->table->column relationships

Parameters:

schema_info – Schema metadata containing all tables and relationships

Returns:

Compact schema representation string

Example:

Schema ecommerce(
    Table Customer(
        CustomerId:INT*PK,
        Email:STR*UQ*NN,
        FirstName:STR*NN,
        LastName:STR*NN,
        CreatedAt:TS*NN
    ),
    Table Order(
        OrderId:INT*PK,
        CustomerId:INT*NN*FK->Customer.CustomerId,
        OrderDate:DT*NN,
        TotalAmount:DEC*NN,
        Status:STR*NN
    ),
    Table OrderItem(
        OrderItemId:INT*PK,
        OrderId:INT*NN*FK->Order.OrderId,
        ProductId:INT*NN*FK->Product.ProductId,
        Quantity:INT*NN,
        UnitPrice:DEC*NN
    )
)
mcp_ohmy_sql.db.relational.schema_2_encoder.encode_database_info(database_info: DatabaseInfo) str[source]

Encode a database into LLM-friendly compact format.

Format:

DatabaseType Database DatabaseName(
    Schema SchemaName(
        encoded_table_info_1,
        encoded_table_info_2,
        ...,
    ),
    ...
)
Parameters:

database_info – Database metadata containing schemas and tables

Returns:

Compact database representation string