Skip to content Skip to sidebar Skip to footer

Use Temp Table With SQLAlchemy

I am trying to use use a temp table with SQLAlchemy and join it against an existing table. This is what I have so far engine = db.get_engine(db.app, 'MY_DATABASE') df = pd.DataFram

Solution 1:

In case the number of records to be inserted in the temporary table is small/moderate, one possibility would be to use a literal subquery or a values CTE instead of creating temporary table.

# MODEL
class ExistingTable(Base):
    __tablename__ = 'existing_table'
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)
    # ...

Assume also following data is to be inserted into temp table:

# This data retrieved from another database and used for filtering
rows = [
    (1, 100, datetime.date(2017, 1, 1)),
    (3, 300, datetime.date(2017, 3, 1)),
    (5, 500, datetime.date(2017, 5, 1)),
]

Create a CTE or a sub-query containing that data:

stmts = [
    # @NOTE: optimization to reduce the size of the statement:
    # make type cast only for first row, for other rows DB engine will infer
    sa.select([
        sa.cast(sa.literal(i), sa.Integer).label("id"),
        sa.cast(sa.literal(v), sa.Integer).label("value"),
        sa.cast(sa.literal(d), sa.DateTime).label("date"),
    ]) if idx == 0 else
    sa.select([sa.literal(i), sa.literal(v), sa.literal(d)])  # no type cast

    for idx, (i, v, d) in enumerate(rows)
]
subquery = sa.union_all(*stmts)

# Choose one option below.
# I personally prefer B because one could reuse the CTE multiple times in the same query
# subquery = subquery.alias("temp_table")  # option A
subquery = subquery.cte(name="temp_table")  # option B

Create final query with the required joins and filters:

query = (
    session
    .query(ExistingTable.id)
    .join(subquery, subquery.c.id == ExistingTable.id)
    # .filter(subquery.c.date >= XXX_DATE)
)

# TEMP: Test result output
for res in query:
    print(res)    

Finally, get pandas data frame:

out_df = pd.read_sql(query.statement, engine)
result = out_df.to_dict('records')

Solution 2:

You can try to use another solution - Process-Keyed Table

A process-keyed table is simply a permanent table that serves as a temp table. To permit processes to use the table simultaneously, the table has an extra column to identify the process. The simplest way to do this is the global variable @@spid (@@spid is the process id in SQL Server).

...

One alternative for the process-key is to use a GUID (data type uniqueidentifier).

http://www.sommarskog.se/share_data.html#prockeyed


Post a Comment for "Use Temp Table With SQLAlchemy"