AK-DataGeek — Boutique Data Engineering and AI Consultancy. Cloud Data Warehouse Migration, Snowflake, Redshift, BigQuery, AI Systems, Agentic AI Workflows. Trusted by enterprises in aviation, insurance, transport, and fintech. Based in Krakow, Poland.

Boutique data & AI consultancy · Building data systems since 2011

Data engineering for the cloud era.
AI systems for the agentic one.

AK-DataGeek is a boutique data engineering and AI consultancy. We migrate legacy data warehouses to Snowflake, Redshift and BigQuery, design lakehouse architectures, and ship production AI systems with Claude Code and agentic workflows. 13+ years across aviation, insurance, transport and fintech.

13+
Years building
data systems
5×
Cloud DWH migrations
delivered end-to-end
2
Production AI products
shipped & operated
4
Regulated industries
served
Trusted by enterprises across regulated industries

Our work supports operations at institutions you recognise.

01 / Services What we do

Five practices, one principle: data that compounds.

01

Cloud Data Warehouse Migration

Legacy Oracle, Teradata, SQL Server and PostgreSQL migrations to Snowflake, Amazon Redshift, or Google BigQuery. End-to-end ownership: discovery, schema redesign, ELT pipelines, cutover and post-migration validation. Specialised in regulated industries where downtime carries real penalties.
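
To give a flavour of the validation step, a minimal reconciliation sketch, assuming a hypothetical customer table and a Snowflake target: the legacy extract and the migrated table are compared on row counts and a column checksum before cutover is signed off.

    -- Hypothetical table and column names; HASH() is the Snowflake built-in.
    SELECT 'legacy' AS side,
           COUNT(*) AS row_count,
           SUM(HASH(customer_id, email)) AS checksum
    FROM   staging.legacy_customers
    UNION ALL
    SELECT 'target',
           COUNT(*),
           SUM(HASH(customer_id, email))
    FROM   analytics.dim_customer;

Matching counts and checksums on both sides are a necessary, though not sufficient, signal that the cutover preserved the data.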

Snowflake Redshift BigQuery dbt Airflow Terraform
02

Data Warehouse & Lakehouse Design

Greenfield warehouse and data lake architecture using Kimball, Data Vault, or hybrid bitemporal approaches. Slowly-changing dimensions, partitioning strategies and cluster keys engineered to survive 10× data growth without rewrites.
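
To make the slowly-changing-dimension point concrete, a minimal point-in-time join, assuming a hypothetical Type 2 dimension with effective_from / effective_to columns: each order is matched to the customer attributes that were current when it was placed.

    -- Hypothetical fact and dimension names.
    SELECT o.order_id,
           o.order_ts,
           c.segment
    FROM   fact_orders o
    JOIN   dim_customer c
      ON   c.customer_id = o.customer_id
     AND   o.order_ts >= c.effective_from
     AND   o.order_ts <  COALESCE(c.effective_to, TIMESTAMP '9999-12-31 00:00:00');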

Kimball Data Vault PostGIS Iceberg S3 / GCS
03

AI & Machine Learning Systems

Production ML pipelines, signal computation engines, real-time inference layers. Bitemporal feature stores. Customer-facing AI products with backtest-ready architectures. From research notebook to GCP Cloud Run in weeks, not quarters.
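
A minimal sketch of the bitemporal idea, assuming a hypothetical feature_store table: valid_from / valid_to track when a value applied in the real world, recorded_at tracks when the pipeline learned it, so a backtest reads only what was known at decision time.

    -- Hypothetical schema and timestamps; QUALIFY as in BigQuery / Snowflake.
    SELECT feature_name,
           feature_value
    FROM   feature_store
    WHERE  entity_id = 42
      AND  valid_from  <= TIMESTAMP '2025-06-01 12:00:00'
      AND  (valid_to IS NULL OR valid_to > TIMESTAMP '2025-06-01 12:00:00')
      AND  recorded_at <= TIMESTAMP '2025-06-01 12:00:00'
    QUALIFY ROW_NUMBER() OVER (PARTITION BY feature_name ORDER BY recorded_at DESC) = 1;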

PyTorch YOLOv8 RT-DETR DVC W&B FastAPI
04

Agentic AI & Automation

Claude Code workflows, MCP server integration, custom agentic systems for data operations. Replace manual data wrangling with AI agents that ingest, validate, and route. Tooling that gives a single senior engineer the output of a whole team.

Claude Code MCP OpenRouter LangChain Custom agents
05

Databricks Architecture & Migration

Lakehouse architecture on Databricks — Delta Lake design, Unity Catalog governance, Spark optimisation, and migration from legacy Hadoop or cloud warehouse stacks. Delivered in partnership with a certified Databricks architect. From proof-of-concept to production-grade data platform.
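
For a sense of the defaults involved, a minimal Delta Lake sketch in Databricks SQL, assuming a hypothetical events table: partitioning on the ingest date plus Z-ordering on the join key keeps large scans pruned.

    -- Hypothetical table; standard Databricks SQL.
    CREATE TABLE IF NOT EXISTS lakehouse.events (
        event_id    STRING,
        user_id     STRING,
        event_ts    TIMESTAMP,
        ingest_date DATE
    )
    USING DELTA
    PARTITIONED BY (ingest_date);

    OPTIMIZE lakehouse.events ZORDER BY (user_id);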

Databricks Delta Lake Unity Catalog Apache Spark MLflow
02 / Case Studies Selected work

Patterns that compound across engagements.

Client names are kept confidential under NDA. The engagement patterns and outcomes below describe the shape of work we deliver: a composite of multiple projects across regulated industries.

Own product · EEC 2026 Semi-finalist
nestis.cloud
Compliance-first SaaS infrastructure for regulated intermediaries, selected as a 2026 EEC Startup Challenge semi-finalist in the 4TECH category.
Cloud-native platform handling secure document intake, GDPR-compliant audit trails, and AI-powered processing automation for regulated brokers and intermediaries in CEE. Built on FastAPI, PostgreSQL, GCP Cloud Run, with Document AI and RPA layers. Architected, built, and operated end-to-end with AI-augmented workflows.
Python FastAPI PostgreSQL GCP Document AI React
4TECH
EEC category
2026 semi-finals
CEE
Geographic
focus
EEC Startup Challenge 2026: Nestis.cloud semi-finalist in the 4TECH category
Own product · Under construction
straitsmonitor.com · ARGUS Alpha
Commodity flow intelligence platform — real-time vessel monitoring, energy demand signals, computed indices for trading desks.
End-to-end data platform combining real-time AIS WebSocket ingestion, multi-source enrichment (energy, markets, vessel master data), bitemporal signal computation, and GCP-native architecture with BigQuery EAV patterns. Serves commodity traders, macro funds, and compliance desks. Architected, built, and operated by AK-DataGeek.
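
A rough sketch of the EAV pattern mentioned above, assuming hypothetical table and signal names: signals land as one row per entity, attribute and timestamp, then are pivoted back into columns at query time, so new signals need no schema change.

    -- Hypothetical dataset, table, signal and attribute names.
    SELECT entity_id,
           signal_ts,
           MAX(IF(attribute = 'transit_count', signal_value, NULL)) AS transit_count,
           MAX(IF(attribute = 'dwell_hours',   signal_value, NULL)) AS dwell_hours
    FROM   `project.signals.signal_eav`
    GROUP  BY entity_id, signal_ts;
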
BigQuery Cloud Run Python FastAPI Terraform PostGIS
7+
Data sources
integrated
<30s
End-to-end
latency
3
Trader
personas served
Regulated Sector · Enterprise migration
Legacy → Cloud HR Data Platform
Client confidential
Strategic migration of enterprise-scale workforce data from a legacy on-premise environment to a modern cloud lakehouse stack.
Multi-quarter engagement with a major aviation enterprise. Migration of HR datasets from the legacy environment to Snowflake/dbt/Airflow, owning deployment across dev, test and production environments within a Scrum delivery framework. Modular ELT pipelines designed for version control, code review and full SDLC discipline.
Snowflake dbt Airflow Python Scrum
3
Environments
owned end-to-end
100%
SDLC
discipline
Insurance Fintech · Multi-year
Greenfield Cloud Data Warehouse
Client confidential
Three-year engagement building a fintech's first centralised cloud data warehouse from the ground up.
Initiated and architected a centralised DWH consolidating disparate operational and marketing data sources. Migrated the existing PostgreSQL warehouse to Amazon Redshift with proper query optimisation. Delivered self-service analytics enablement through BI integration. Used Kimball and Data Vault methodologies side by side.
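
To illustrate the Data Vault side, a minimal hub-and-satellite sketch, assuming a hypothetical policy entity: the hub carries only the stable business key, while the satellite carries descriptive attributes with full load history.

    -- Hypothetical entity and columns; plain Redshift-compatible DDL.
    CREATE TABLE hub_policy (
        policy_hk   CHAR(32)    NOT NULL,  -- hash of the business key
        policy_no   VARCHAR(50) NOT NULL,
        load_dts    TIMESTAMP   NOT NULL,
        record_src  VARCHAR(50) NOT NULL
    );

    CREATE TABLE sat_policy_details (
        policy_hk   CHAR(32)      NOT NULL,
        load_dts    TIMESTAMP     NOT NULL,
        premium     DECIMAL(12,2),
        status      VARCHAR(20),
        record_src  VARCHAR(50)   NOT NULL
    );
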
Amazon Redshift Pandas / PySpark Kimball Data Vault Looker
3y
Engagement
length
JSON
Complex parsing
at scale
Transport · Mission-critical
Geospatial Data Warehouse Migration
Client confidential
Migration of safety-critical geospatial data infrastructure from an on-premise stack to AWS Redshift.
Large-scale ELT pipeline build for GPS-generated geospatial data feeding fleet management, route optimisation, and operational efficiency analytics. Multi-billion record migration with a full audit trail and zero-downtime requirements. Foundational work in spatial data handling using Python's geospatial libraries.
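
A minimal sketch of the spatial handling, assuming a hypothetical pings table: raw longitude/latitude pairs become typed, indexed geometries so route and proximity queries stay fast at billions of rows.

    -- Hypothetical table and columns; standard PostGIS.
    ALTER TABLE vehicle_pings
        ADD COLUMN geom geometry(Point, 4326);

    UPDATE vehicle_pings
        SET geom = ST_SetSRID(ST_MakePoint(lon, lat), 4326);

    CREATE INDEX idx_vehicle_pings_geom
        ON vehicle_pings USING GIST (geom);
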
AWS Redshift Python SnapLogic GPS / XML data PostGIS
9B
Records
migrated
0
Hours of
downtime
Insurance · UK Enterprise
Insurance DWH Migration
Client confidential
End-to-end DWH migration for one of the UK's largest insurance groups, with PL/pgSQL automation.
Designed and delivered ELT pipelines and owned deployment across dev/test/prod environments. Built complex stored procedures to automate data processing, transformation and validation tasks. Cross-functional collaboration on schema validation, data quality checks and post-migration testing in the highly regulated insurance sector.
PostgreSQL PL/pgSQL SQL Server ELT
3
Environments
owned
UK
Regulated
insurance sector
03 / Method How we work

AI-augmented engineering. Boutique scale, enterprise output.

Claude Code, MCP, custom agents — production-grade since day one.

Modern data work isn't done by hand anymore. AK-DataGeek leans heavily on AI-augmented workflows — Claude Code for engineering, MCP servers for tool integration, OpenRouter for cost-optimised model routing — to deliver in weeks what traditional teams take quarters to ship.

What this means for clients: shorter cycles, lower cost, higher quality. A migration that would take a 3-person team six months gets done in eight weeks, with better test coverage and cleaner architectural documentation.

AI doesn't replace senior judgment — it amplifies it. No code ships unreviewed. Every commit goes through human review, every architectural decision lives in an ADR, every customer-facing output is verified against source. That's the line.

04 / Approach How we engage

Five principles that shape every engagement.

01

Senior-only delivery

Every line of code, every architectural decision, every client-facing artefact comes from senior hands. We don't sell juniors at senior rates. The trade-off is scale — we take one or two engagements at a time and stay deep in them. Most of our clients prefer this trade.

02

End-to-end ownership

Discovery, design, build, deploy, validate, document — all under one roof. We don't hand off between specialists. The same person who designs the Snowflake schema writes the dbt models, configures the Airflow DAG, and explains the trade-offs to your CTO. This eliminates the most expensive failure mode in consulting: lossy handoffs.

03

Documentation is part of the deliverable

Every architectural decision lives in an ADR (Architecture Decision Record). Every schema change has a migration script. Every data contract has a test. When the engagement ends, your team owns a system they can run, debug and extend without us. That's the whole point.
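
A data contract test can be as small as a plain SQL file; a minimal sketch, assuming a hypothetical orders table and rules: the query returns the rows that violate the contract, so an empty result means the check passes and CI can fail the build otherwise.

    -- Hypothetical table and rules; runs as a dbt singular test or a CI query.
    SELECT order_id
    FROM   analytics.fact_orders
    WHERE  order_total < 0
       OR  customer_id IS NULL;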

04

AI as a force multiplier, not a black box

We use Claude Code, MCP servers, and custom agents heavily — but every artefact is human-reviewed before it ships. AI accelerates the routine 70% of engineering work; senior judgment owns the 30% that matters. Clients get faster delivery without inheriting opaque AI-generated code they cannot understand or maintain.

05

Built in regulated environments

Insurance, banking, transport, regulated tech — sectors where data correctness, audit trails and compliance are not optional. Our architectural defaults assume regulators will read the audit log. Bitemporal modelling, point-in-time correctness, full lineage and test coverage are baseline, not premium add-ons.

05 / Stack Tools of the trade

Battle-tested stack. Pragmatic choices, not hype.

Languages
Python · PySpark
T-SQL · PL/SQL
PL/pgSQL
ANSI SQL
Pandas
Cloud Warehouses
Snowflake
Amazon Redshift
BigQuery
Oracle · Teradata
PostgreSQL · MongoDB
Cloud & Infra
AWS · GCP
S3 · Glue · Athena
Lambda · Cloud Run
Terraform
Docker
Orchestration & ELT
Apache Airflow
dbt
Cloud Composer
SnapLogic
Git · CI/CD
ML & AI
Claude Code
PyTorch · YOLOv8
RT-DETR
Weights & Biases
DVC · MLflow
Agentic Layer
MCP servers
OpenRouter routing
Perplexity · Gemini
Custom agent loops
Hermes · Paperclip AI
Geospatial
PostGIS
rasterio · pyproj
fiona · GDAL
QGIS · ArcGIS
PUWG / WGS84
Methods
Kimball · Data Vault
Bitemporal modelling
SCD Type 2 · EAV
Scrum · Agile delivery
ADR-driven design
06 / FAQ Common questions

Answers to questions before you ask them.

What does AK-DataGeek do?

AK-DataGeek is a boutique data engineering and AI consultancy. We specialise in five practices: cloud data warehouse migrations from legacy systems like Oracle, Teradata and SQL Server to modern platforms such as Snowflake, Amazon Redshift and Google BigQuery; data warehouse and lakehouse design using Kimball and Data Vault methodologies; production AI and machine learning systems; agentic AI workflows using tools like Claude Code and MCP servers; and Databricks lakehouse architecture and migration, including Delta Lake and Unity Catalog.

Who has AK-DataGeek worked with?

Our team has delivered data engineering work for enterprises across regulated industries including aviation (British Airways), insurance (Aviva, Cuvva), transport (Transport for London, Arriva Group), fintech (Capitalise.com), logistics (UPS), engineering (BRE Group), and through technology consultancies (CodiLime). Specific project details are kept confidential under client NDAs — case studies on this site present the shape and outcome of work without naming individual engagements.

What technologies does AK-DataGeek work with?

Core stack covers Python, PySpark, T-SQL, PL/SQL and ANSI SQL. Cloud warehouses: Snowflake, Amazon Redshift, Google BigQuery, plus legacy Oracle, Teradata and PostgreSQL. Cloud platforms: AWS and Google Cloud Platform. Orchestration: Apache Airflow, dbt, Cloud Composer. AI and ML: Claude Code, PyTorch, YOLOv8, Weights & Biases, DVC. Agentic layer: MCP servers, OpenRouter, Perplexity, Gemini, custom agent loops.

How is AK-DataGeek different from a traditional consulting firm?

AK-DataGeek is a boutique consultancy that uses AI-augmented workflows to deliver senior-level work at a fraction of traditional consulting timelines. By leveraging Claude Code, MCP servers, and custom agent systems, we produce the output of a small team — with shorter delivery cycles, lower cost, and clean architectural documentation. The model trades scale for speed and depth. Clients work directly with senior engineers, never with junior staff billing at senior rates.

Where is AK-DataGeek based and where does it operate?

AK-DataGeek is based in Krakow, Poland and operates remote-first. Engagements are accepted across Poland, the United Kingdom, Germany, and the broader European Union. English-speaking engagements globally are accepted on a selective basis. Both Polish and English are working languages.

Does AK-DataGeek build its own products?

Yes. AK-DataGeek currently runs two production AI products. Nestis.cloud is a compliance-first SaaS infrastructure for regulated intermediaries, selected as a 2026 EEC Startup Challenge semi-finalist in the 4TECH category. Straitsmonitor.com is a commodity flow intelligence platform serving trading desks. Both products demonstrate our ability to ship production AI systems end-to-end.

How can I hire AK-DataGeek?

Contact AK-DataGeek via email at hello@ak-datageek.com or through LinkedIn. Engagements are accepted selectively — typically one or two projects at a time — with availability from Q3 2026. The first conversation is free and exploratory; we'll figure out fit before discussing scope or pricing.

Let's build something that ships.

Available for selective consulting engagements from Q3 2026 — DWH migrations, AI/ML systems, agentic automation. One or two projects at a time, deeply hands-on. Drop a line if that fits how you work.

Send a message