Skip to content

Latest commit

 

History

History
134 lines (101 loc) · 2.86 KB

File metadata and controls

134 lines (101 loc) · 2.86 KB

🔧 Fix: Create Missing Bronze Tables

The Problem

Your ETL is working perfectly - pagination fetched thousands of records! 🎉

But inserts are failing because tables don't exist in the bronze schema:

❌ Error inserting batch: "Could not find the table 'public.al_cohorts'"

✅ The Solution (2 Steps)

Step 1: Generate SQL for All Tables

Run this command to discover all API endpoints and generate the SQL:

python create_bronze_tables.py > bronze_tables.sql

This will:

  • Discover all 39 endpoints from the API
  • Generate SQL for tables like:
    • bronze.al_cohorts
    • bronze.al_courses
    • bronze.csusb_cohorts
    • bronze.csusb_courses
    • etc.

Step 2: Run SQL in Supabase

  1. Open: Supabase SQL Editor

  2. Copy contents of bronze_tables.sql

  3. Paste and execute

  4. Verify tables were created:

SELECT table_name 
FROM information_schema.tables 
WHERE table_schema = 'bronze'
ORDER BY table_name;

Step 3: Re-run ETL

python bronze_etl.py

Now it will successfully load data! ✅

🚀 Quick Alternative: Use --create-tables

If you want to see SQL for just the endpoints that have data:

# This will generate SQL only for tables with data
python bronze_etl.py --create-tables --dry-run | grep "CREATE TABLE" -A 10 > tables_to_create.sql

What Was Fixed

Before ❌

# Tried to insert into public.al_cohorts (wrong schema)
supabase.table('al_cohorts').insert(...)  # Uses public by default

After ✅

# Now correctly uses bronze schema
supabase.postgrest.schema('bronze')  # Set schema
supabase.table('al_cohorts').insert(...)  # Inserts into bronze.al_cohorts

📊 Expected Results

After fixing, you should see:

🔄 Fetching /al/cohorts (expected: 1800 records)
   📄 Page 1: 100 records (total so far: 100)
   ...
   📄 Page 18: 100 records (total so far: 1800)
✅ Fetched 1800 total records from /al/cohorts
   Table: bronze.al_cohorts
✅ Inserted batch 1: 1000 records
✅ Inserted batch 2: 800 records
✅ Successfully loaded 1800 records into bronze.al_cohorts ✅

Verify Data Loaded

-- Check record counts
SELECT 
    'al_cohorts' as table_name,
    COUNT(*) as records
FROM bronze.al_cohorts
UNION ALL
SELECT 
    'al_courses',
    COUNT(*)
FROM bronze.al_courses
UNION ALL
SELECT 
    'csusb_cohorts',
    COUNT(*)
FROM bronze.csusb_cohorts;

-- View sample data
SELECT 
    id,
    data->>'id' as record_id,
    source_endpoint,
    ingested_at
FROM bronze.al_cohorts
LIMIT 5;

🎉 Summary

Problem: Tables didn't exist in bronze schema
Solution: Generate and run table creation SQL
Pagination: Already working perfectly! ✅
Next: Load all your data successfully! 🚀


Need help? Check BRONZE_ETL_README.md or PAGINATION_GUIDE.md