Secure Streamlit Development: GitHub-Integrated Workflow for Financial Data Visualization
Date: October 14, 2025 | Author: Daniel Shanklin | Tags: Business Intelligence, Data Security, Development Workflow
Executive Summary
Financial firms face a critical challenge when building internal data visualization tools: how to provide analysts with flexible, self-service dashboards while maintaining strict data security controls. Traditional self-hosted Streamlit deployments expose sensitive data outside the database perimeter, creating security risks and compliance burdens.
This article examines a production workflow that eliminates these risks by keeping data within Snowflake's security boundary while maintaining local development velocity. The approach uses Snowflake's Git integration to deploy Streamlit apps directly from GitHub, enabling rapid iteration without compromising security.
Key insight: By combining local development with environment-aware connection patterns and Snowflake's native Git integration, development teams can achieve both security and agility without complex infrastructure management.
The Security Problem with Self-Hosted Dashboards
Traditional Streamlit deployments for financial data visualization create several security and operational challenges:
Data exposure risk: Self-hosted Streamlit servers require database credentials and maintain persistent connections to production data sources. This creates additional attack surfaces where sensitive position data, portfolio holdings, or trading information could be exposed.
Credential management complexity: Each deployment environment needs separate credential management, typically involving secret stores, environment variables, or configuration files that must be secured and rotated. Security teams must audit multiple systems to ensure compliance.
Infrastructure overhead: Self-hosted deployments require server provisioning, monitoring, patching, and scaling. For organizations without dedicated DevOps teams, this represents significant ongoing operational burden.
Cost structure: Cloud hosting costs for always-on Streamlit servers typically range from $50-200/month per application, plus engineering time for maintenance and security updates. Organizations with multiple analytics applications can quickly accumulate substantial infrastructure costs.
Industry research from the Cloud Security Alliance indicates that 60% of data breaches at financial services firms involve credentials or access management failures in self-hosted analytics infrastructure.
Solution Architecture: Git-Integrated Development Workflow
The proposed workflow eliminates data exposure risks by deploying Streamlit applications inside Snowflake's security boundary while maintaining local development flexibility.
graph LR
A[Local Development] --> B[Test with ./run_app.sh]
B --> C{Changes Work?}
C -->|No| A
C -->|Yes| D[Push to GitHub]
D --> E[Snowflake Git Repository]
E --> F[Manual Pull in Snowsight]
F --> G[Streamlit in Snowflake]
subgraph "Development Environment"
A
B
end
subgraph "Version Control"
D
E
end
subgraph "Production - Inside Snowflake"
F
G
end
style G fill:#e8f5e9
style E fill:#e3f2fd
style A fill:#fff3e0
Architecture components:
- Local Development Environment: Developers work with app.py and test against Snowflake using environment variables with key-pair authentication. The connection_helper.py module detects the execution environment and connects appropriately.
- Git-Ready Deployment Files: streamlit_app.py (simplified for Snowflake execution) and environment.yml (Anaconda package specification) live alongside development files in the repository.
- Snowflake Git Integration: Snowflake connects to the GitHub repository and maintains a synchronized copy of the code. Updates are pulled on demand through the Snowsight UI.
- Streamlit in Snowflake: The application runs inside Snowflake's compute environment using get_active_session(), eliminating credential management and ensuring data never leaves the platform.
Security benefits: This architecture ensures position data, portfolio holdings, and other sensitive information remain within Snowflake's security boundary. No credentials are stored in code or configuration files. Access control is managed entirely through Snowflake's RBAC system.
Implementation: Environment-Aware Connection Pattern
The core technical challenge is supporting both local development (where developers need explicit credentials) and Snowflake deployment (where credentials are implicit through the active session).
Connection helper implementation:
import os

from snowflake.snowpark import Session


def get_session():
    """
    Return the appropriate Snowflake session for the current environment.

    - In Streamlit in Snowflake: uses get_active_session()
    - Locally: uses environment variables with key-pair auth
    """
    try:
        # Try the SiS session first; get_active_session() raises when there
        # is no active session, i.e. when running outside Snowflake.
        from snowflake.snowpark.context import get_active_session
        return get_active_session()
    except Exception:
        # Fall back to a local connection built from environment variables.
        connection_parameters = {
            "account": os.environ["SNOWFLAKE_ACCOUNT"],
            "user": os.environ["SNOWFLAKE_USER"],
            "role": os.environ.get("SNOWFLAKE_ROLE"),
            "warehouse": os.environ.get("SNOWFLAKE_WAREHOUSE"),
            "database": os.environ.get("SNOWFLAKE_DATABASE"),
            "schema": os.environ.get("SNOWFLAKE_SCHEMA"),
        }

        # Use key-pair authentication for local development.
        private_key_path = os.environ.get("SNOWFLAKE_PRIVATE_KEY_PATH")
        if private_key_path:
            from cryptography.hazmat.backends import default_backend
            from cryptography.hazmat.primitives import serialization

            with open(private_key_path, "rb") as key_file:
                private_key_data = key_file.read()

            passphrase = os.environ.get("SNOWFLAKE_PRIVATE_KEY_PASSPHRASE")
            if passphrase:
                passphrase = passphrase.encode()

            private_key = serialization.load_pem_private_key(
                private_key_data,
                password=passphrase,
                backend=default_backend(),
            )
            # Snowpark expects the key as unencrypted DER-encoded PKCS8 bytes.
            connection_parameters["private_key"] = private_key.private_bytes(
                encoding=serialization.Encoding.DER,
                format=serialization.PrivateFormat.PKCS8,
                encryption_algorithm=serialization.NoEncryption(),
            )

        return Session.builder.configs(connection_parameters).create()
Key design decision: The code attempts Snowflake's get_active_session() first, which only succeeds when running inside Streamlit in Snowflake. Local development falls back to environment variable authentication with RSA key-pair credentials.
Security consideration: Key-pair authentication is used instead of passwords to support multi-factor authentication requirements common in financial services environments. The private key is encrypted with a passphrase and never stored in version control.
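For completeness, the key pair itself can be generated with the same cryptography package the helper already imports. A minimal sketch, assuming a 2048-bit RSA key; the passphrase is a placeholder, and the public key body is what would be registered on the Snowflake user with ALTER USER ... SET RSA_PUBLIC_KEY:

```python
# Sketch: generate an RSA key pair for Snowflake key-pair authentication.
# The passphrase below is a placeholder; use a real secret in practice.
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# Encrypt the private key with a passphrase before writing it to disk;
# this is the file SNOWFLAKE_PRIVATE_KEY_PATH points at.
private_pem = key.private_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PrivateFormat.PKCS8,
    encryption_algorithm=serialization.BestAvailableEncryption(b"my-passphrase"),
)

# Snowflake expects the public key body without the PEM header/footer lines.
public_pem = key.public_key().public_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
).decode()
public_key_body = "".join(
    line for line in public_pem.splitlines() if "-----" not in line
)
```

The private key file stays on the developer's machine; only the public key body is ever pasted into Snowflake.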
Production Example: Position Data Viewer
The HelloSnowflake application demonstrates this pattern with real position data from the MERIDIAN database. The application queries the btig_ps table containing daily position snapshots with market values and portfolio allocations.
Query implementation:
query = """
WITH p AS (
    SELECT "symbol",
           "side",
           "basemarketvalue"
    FROM MERIDIAN.POSTGRES.btig_ps AS p
    WHERE p."report_date" = (
        SELECT TOP 1 "report_date"
        FROM MERIDIAN.POSTGRES.btig_ps
        ORDER BY "report_date" DESC
    )
)
SELECT *
FROM p
"""
session = get_session()
df = session.sql(query).to_pandas()
Important implementation detail: The table columns use lowercase identifiers created with quoted column names during table creation. Snowflake requires quoted identifiers in queries: "symbol" not SYMBOL. This is a common gotcha when migrating data from PostgreSQL or other databases that preserve case.
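One way to avoid hand-quoting mistakes is a small utility that quotes any identifier Snowflake's uppercase folding would mangle. This helper is hypothetical (not part of the app), but it captures the rule: unquoted identifiers fold to uppercase, so anything else must be double-quoted to preserve its original case:

```python
# Sketch: a hypothetical helper that quotes Snowflake identifiers only
# when uppercase folding would change them.
import re

def quote_identifier(name: str) -> str:
    """Return `name` quoted for Snowflake unless folding is harmless."""
    if re.fullmatch(r"[A-Z_][A-Z0-9_]*", name):
        return name  # already matches Snowflake's uppercase folding
    # Escape embedded double quotes by doubling them, per SQL rules.
    return '"' + name.replace('"', '""') + '"'

columns = ["symbol", "side", "basemarketvalue"]
select_list = ", ".join(quote_identifier(c) for c in columns)
# select_list → '"symbol", "side", "basemarketvalue"'
```

Used when building the SELECT list, this keeps PostgreSQL-migrated lowercase columns addressable without sprinkling quotes by hand.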
Application features:
- Summary metrics: total positions, market value, unique symbols
- Position details table with formatted currency values
- Breakdown by side (long/short) with visualization
- CSV export functionality
The application handles 4,625 positions with sub-second query performance, demonstrating that Streamlit in Snowflake provides production-grade performance for real-world financial datasets.
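The summary metrics above reduce to simple aggregations once the query result is in memory. A minimal pure-Python sketch, with column names as in the query and invented sample rows:

```python
# Sketch: compute the app's summary metrics from query results.
# The sample rows below are invented for illustration.
positions = [
    {"symbol": "AAPL", "side": "L", "basemarketvalue": 1_000_000.0},
    {"symbol": "MSFT", "side": "L", "basemarketvalue": 750_000.0},
    {"symbol": "TSLA", "side": "S", "basemarketvalue": -250_000.0},
]

total_positions = len(positions)
total_market_value = sum(p["basemarketvalue"] for p in positions)
unique_symbols = len({p["symbol"] for p in positions})

# Breakdown by side for the long/short visualization.
by_side: dict[str, float] = {}
for p in positions:
    by_side[p["side"]] = by_side.get(p["side"], 0.0) + p["basemarketvalue"]
```

In the deployed app the same aggregations run over the pandas DataFrame and feed Streamlit's metric and chart widgets.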
Deployment Process: GitHub to Snowflake
Snowflake's Git integration uses a manual pull workflow that balances automation with production deployment control.
Initial setup in Snowsight:
- Navigate to Projects → Streamlit
- Select "Create from repository"
- Connect to the GitHub repository and authenticate
- Select the branch (typically main) and directory path (apps/HelloSnowflake/)
- Specify the main file (streamlit_app.py) and warehouse (STREAMLIT_WH)
Update workflow:
- The developer modifies code locally and tests with ./run_app.sh
- Changes are committed and pushed to GitHub
- In Snowsight, open the Streamlit app and click "Pull" on the Files tab
- Snowflake fetches the changes from GitHub and merges them into the deployed application
- The app automatically reloads with the updated code
Snowflake environment specification:
# environment.yml
name: app_environment
channels:
- snowflake
dependencies:
- streamlit
- snowflake-snowpark-python
- pandas
Critical formatting note: Snowflake's Anaconda integration requires package names without version specifiers. Using streamlit>=1.28.0 will cause deployment errors. Snowflake automatically selects compatible versions from its curated Anaconda channel.
Manual Pull rationale: Snowflake implements human-in-the-loop deployment intentionally. Financial services applications often require change control review before production updates. The manual Pull workflow provides an explicit approval gate while still maintaining GitHub as the source of truth for version control.
Cost and Performance Analysis
Infrastructure cost comparison:
Traditional self-hosted approach:
- Cloud compute: $100-200/month for an always-on server
- Database egress: $20-50/month for data transfer
- Monitoring/logging: $30-50/month
- Total: $150-300/month per application
Snowflake-native approach:
- Streamlit in Snowflake: included in the Snowflake contract
- Compute: billed per-second on an existing warehouse while the app is in use
- Storage: negligible (just application code)
- Typical cost: $10-30/month based on actual usage
Performance characteristics:
Query performance in Streamlit in Snowflake matches direct Snowflake query performance. The 4,625-position test query executes in under 1 second on an XSMALL warehouse.
Performance consideration: Running Streamlit apps directly from Git-integrated stages can introduce minor latency (50-200ms) compared to regular Snowflake stages due to branch management overhead. For production applications with strict performance SLAs, consider copying code from the Git stage to a regular stage after validation.
Developer productivity impact: The local development workflow provides instant feedback during development. Changes can be tested locally in seconds rather than waiting for deployment pipelines. This significantly accelerates debugging and iteration cycles.
Implementation Lessons Learned
Column name case sensitivity: Snowflake's identifier handling is nuanced. Unquoted identifiers are converted to uppercase (SYMBOL), but tables migrated from PostgreSQL often use quoted lowercase identifiers ("symbol"). Queries must match the original case when using quoted identifiers. The test script revealed actual column names before implementation, avoiding query failures in production.
Authentication approach evolution: Initial implementation used password authentication, which failed when multi-factor authentication was enabled on the Snowflake account. Key-pair authentication resolved this while maintaining security compliance. The lesson: design for MFA requirements from the start in financial services environments.
Environment variable loading: Streamlit doesn't automatically load .env files when started. The run_app.sh wrapper script explicitly exports environment variables before launching Streamlit, ensuring local development credentials are available. This is a common source of "connection failed" errors when developers run streamlit run app.py directly.
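If a shell wrapper is undesirable, the same effect can be achieved with a few lines at the top of app.py. A minimal sketch of a .env loader, handling simple KEY=VALUE lines only (python-dotenv is the fuller alternative); the demo file contents are illustrative:

```python
# Sketch: a minimal .env loader, as an alternative to the run_app.sh
# wrapper that exports variables before launching Streamlit.
import os
import pathlib
import tempfile

def load_env_file(path: str) -> None:
    """Export KEY=VALUE pairs from a .env-style file into os.environ."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and anything that isn't KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip()

# Demo with a throwaway file (contents are made up).
demo = pathlib.Path(tempfile.mkdtemp()) / ".env"
demo.write_text("# local credentials\nSNOWFLAKE_ACCOUNT=xy12345\n")
load_env_file(str(demo))
```

Calling the loader before get_session() makes streamlit run app.py work without the wrapper, at the cost of a small amount of bootstrap code in the app itself.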
Git integration limitations: Snowflake's Git integration doesn't support automatic polling for changes. The manual Pull workflow is the intended design. Organizations requiring fully automated deployments should investigate GitHub Actions workflows that call Snowflake APIs to trigger pulls programmatically.
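Such a trigger can be small: a CI step that executes Snowflake's ALTER GIT REPOSITORY ... FETCH statement against the repository object, which is what the manual Pull does through the UI. A hedged sketch; the fully qualified repository name is hypothetical:

```python
# Sketch: trigger a Git repository fetch from CI instead of clicking "Pull".
# The repository name used below is hypothetical.
def build_fetch_statement(repo_fqn: str) -> str:
    """Build the SQL that refreshes a Snowflake Git repository object."""
    return f"ALTER GIT REPOSITORY {repo_fqn} FETCH"

# In the CI job, with credentials in the environment, this would run as:
#   from connection_helper import get_session  # the article's helper
#   get_session().sql(build_fetch_statement("MY_DB.PUBLIC.APP_REPO")).collect()
```

The CI job needs a service user with key-pair credentials and privileges on the repository object; change-control teams may still prefer the manual gate for production apps.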
Conclusion
Financial services firms can achieve both security and development velocity by deploying Streamlit applications inside Snowflake's security boundary. The Git-integrated workflow maintains GitHub as the source of truth while ensuring sensitive data never leaves the database platform.
Key implementation insights:
- Environment-aware connection patterns enable seamless local development and production deployment
- Key-pair authentication satisfies MFA requirements common in financial services
- Manual pull workflows provide necessary change control gates for production systems
- Costs run 70-85% below self-hosted alternatives
The HelloSnowflake implementation demonstrates this pattern with real position data, providing a production-ready template for similar financial data visualization requirements.
For development teams building internal analytics tools, this approach eliminates infrastructure management burden while satisfying security and compliance requirements—a rare combination in financial services technology.