Available for opportunities · Veszprém, Hungary

Abonyi János
Software Developer & AI Engineer

Junior software developer focused on backend engineering and applied AI — RAG systems, LLM orchestration, and polyglot data architectures.


About

Backend engineering meets applied AI.

I'm a junior software developer from Veszprém, Hungary, currently pursuing an MSc in Computer Science Engineering at the University of Pannonia, after completing my BSc at Eötvös Loránd University (ELTE) with a Software Design specialization.

I enjoy learning how applied-AI systems work: retrieval-augmented generation, multi-database orchestration with LLM tool calling, and ML pipelines.

I care about systems that are correct under failure: idempotent handlers, transactional outboxes, guardrails before data crosses a trust boundary. I also enjoy turning messy real-world data into something searchable and useful.

Selected work

Projects

Applied-AI and data systems, from Corrective-RAG to polyglot persistence.

RAG · AI Safety · Guardrails

Autonomous Compliance Auditor

Corrective-RAG service that audits responses for legal & privacy compliance.

A LangGraph-orchestrated Corrective-RAG (CRAG) pipeline that audits knowledge-base and bot responses for adherence to legal, GDPR/HIPAA privacy, and company policy. It screens input for prompt-injection and jailbreaks, redacts PII before anything reaches the LLM, grades retrieved documents, and falls back to web search when local context is insufficient.

Fail-safe ordering: input guardrails and PII redaction run, in that order, before any text reaches embeddings or the LLM.
Two-stage input screening — cheap regex first, LLM classifier second.
CRAG correction: documents are graded; weak local context triggers a Tavily web search before generation.
Grounded generation returned as a structured ComplianceAudit, then re-checked for grounding and toxicity with one bounded self-correction pass.
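The screening and redaction order described above can be sketched in a few lines. This is a minimal illustration, not the project's code: the patterns, the `prepare_input` name, and the `llm_classifier` callable are all hypothetical, and plain regexes stand in for the PII detection that the project does with Presidio.

```python
import re
from typing import Callable

# Hypothetical patterns; a real deployment would maintain a much larger list.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (the |your )?system prompt", re.IGNORECASE),
]

# Simplified PII patterns, used here only as a stand-in for a real detector.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def regex_screen(text: str) -> bool:
    """Stage 1: cheap pattern check. True means the input looks malicious."""
    return any(p.search(text) for p in INJECTION_PATTERNS)


def redact_pii(text: str) -> str:
    """Replace e-mail addresses and phone numbers with placeholders."""
    text = EMAIL.sub("<EMAIL>", text)
    return PHONE.sub("<PHONE>", text)


def prepare_input(text: str, llm_classifier: Callable[[str], bool]) -> str:
    """Screen first, then redact. Raises before any text is passed downstream."""
    if regex_screen(text):
        raise ValueError("blocked by regex screen")
    if llm_classifier(text):  # stage 2 runs only when stage 1 passes
        raise ValueError("blocked by classifier")
    return redact_pii(text)
```

Running the cheap check first means the more expensive classifier is skipped for inputs that are obviously hostile, and nothing is returned for embedding or generation until both stages pass and redaction has run.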

LangGraph · OpenAI · Pinecone · Presidio · Tavily · FastAPI · Pydantic

View code on GitHub

GraphRAG · Distributed Systems · Polyglot Persistence

Digital HR Architect

Polyglot-persistence HR brain with multi-hop GraphRAG reasoning.

An AI-powered HR platform that unifies five database paradigms — Relational, Graph, Document, Vector, and Big Data — behind one orchestration layer. An LLM uses function calling to route natural-language questions across PostgreSQL, Neo4j, Firestore, Pinecone, and BigQuery ML, chaining results to answer questions no single store can.

Multi-hop GraphRAG: e.g. 'find a great listener mentored by someone who knows Python' chains Pinecone → Neo4j → PostgreSQL → BigQuery ML.
Transactional outbox pattern keeps five stores consistent without a distributed commit — business row + event commit atomically in one PG transaction.
At-least-once delivery with idempotent handlers (MERGE / upsert), dead-letter queue, and a drift detector across all stores.
BigQuery ML logistic-regression model predicts employee flight risk directly in SQL.
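The outbox and idempotent-handler points above can be illustrated with a small, self-contained sketch. It assumes a lot: the standard-library `sqlite3` module stands in for PostgreSQL, a `replica` table stands in for a downstream store, and the table names, `create_employee`, `apply_event`, and `relay_once` are hypothetical, not taken from the project.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE outbox (
        event_id INTEGER PRIMARY KEY AUTOINCREMENT,
        payload  TEXT NOT NULL,
        sent     INTEGER NOT NULL DEFAULT 0
    );
    -- Stand-in for a downstream store (e.g. a graph or document database).
    CREATE TABLE replica (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
""")


def create_employee(emp_id: int, name: str) -> None:
    """Write the business row and its event in one transaction."""
    with conn:  # commits on success, rolls back if either insert fails
        conn.execute("INSERT INTO employees (id, name) VALUES (?, ?)", (emp_id, name))
        payload = json.dumps({"type": "employee_created", "id": emp_id, "name": name})
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))


def apply_event(payload: str) -> None:
    """Idempotent handler: replaying the same event leaves one identical row."""
    event = json.loads(payload)
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO replica (id, name) VALUES (?, ?)",
            (event["id"], event["name"]),
        )


def relay_once() -> int:
    """Deliver unsent events, then mark them sent. Returns the number delivered."""
    rows = conn.execute(
        "SELECT event_id, payload FROM outbox WHERE sent = 0 ORDER BY event_id"
    ).fetchall()
    for event_id, payload in rows:
        apply_event(payload)
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE event_id = ?", (event_id,))
    return len(rows)
```

If the relay crashes between `apply_event` and the `UPDATE`, the event is delivered again on the next run. That is at-least-once delivery, and it is safe only because the handler is idempotent.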

PostgreSQL · Neo4j · Firestore · Pinecone · BigQuery ML · OpenAI · Cloud Pub/Sub · Streamlit

View code on GitHub

RAG · Full-stack · Reranking

Quiz Solver

RAG study assistant that answers exam questions from your own documents.

Upload PDF / DOCX / TXT study materials — including scanned PDFs — and get instant, source-cited answers to any question, including A/B/C/D multiple choice. Each subject lives in its own isolated knowledge base with full Q&A history.

Retrieve top-20 from Pinecone, rerank to top-5 with Cohere, then answer with GPT-4o at low temperature for grounded responses.
3-stage PDF extraction with OCR fallback (pdfjs-dist → pdf-parse → Mistral OCR) for fully scanned documents.
Per-knowledge-base Pinecone namespaces for full document isolation.
Answers rendered as Markdown + LaTeX (KaTeX), always in the question's language.
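The retrieve-then-rerank step above follows a general two-stage pattern: a cheap scorer narrows the corpus, and a more precise scorer orders the short list. The sketch below shows only that shape. The project itself is written in TypeScript and uses Pinecone and Cohere, whereas the two toy scoring functions and every name here are hypothetical stand-ins.

```python
from typing import Callable, List, Sequence

Scorer = Callable[[str, str], float]


def top_n(query: str, docs: Sequence[str], score: Scorer, n: int) -> List[str]:
    """Return the n highest-scoring documents, best first."""
    ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
    return ranked[:n]


def retrieve_then_rerank(
    query: str,
    corpus: Sequence[str],
    cheap_score: Scorer,
    precise_score: Scorer,
    k_retrieve: int = 20,
    k_final: int = 5,
) -> List[str]:
    """Stage 1 keeps k_retrieve candidates; stage 2 reorders them to k_final."""
    candidates = top_n(query, corpus, cheap_score, k_retrieve)
    return top_n(query, candidates, precise_score, k_final)


def keyword_overlap(query: str, doc: str) -> float:
    """Toy first-stage score: number of shared lowercase words."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def phrase_match(query: str, doc: str) -> float:
    """Toy second-stage score: 1.0 when the whole query appears verbatim."""
    return 1.0 if query.lower() in doc.lower() else 0.0
```

The precise scorer only ever sees `k_retrieve` documents, which is what keeps an expensive reranker affordable on a large corpus.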

Next.js · React · TypeScript · OpenAI · Pinecone · Cohere · Supabase · Mistral OCR

View code on GitHub

Cloud · Computer Vision · Backend

PhotoVault

Upload photos and search them by what's actually in them.

A cloud-native photo app: every uploaded image is auto-analyzed by Google Cloud Vision, which returns content labels, so you can search 'dog', 'mountain', or 'receipt' with no manual tagging. Built to compose three Google Cloud services behind one Flask app.

Upload → Cloud Storage, analyze → Vision API labels, store metadata → Firestore, all in one pipeline.
Case-insensitive label search via a lowercased labelsLower array and Firestore array_contains.
Containerized with Docker and Gunicorn, deployed to Google Cloud Run with identity-based auth.
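The case-insensitive search idea above comes down to storing a lowercased copy of the labels at write time and lowercasing the search term at query time. The sketch below mimics this with an in-memory list instead of Firestore. Only the `labelsLower` field name comes from the description above; the function names and document shape are hypothetical.

```python
from typing import Dict, List


def build_photo_doc(name: str, labels: List[str]) -> Dict[str, object]:
    """Keep labels as returned, plus a lowercased copy used only for search."""
    return {
        "name": name,
        "labels": labels,
        "labelsLower": [label.lower() for label in labels],
    }


def search_by_label(docs: List[Dict[str, object]], term: str) -> List[Dict[str, object]]:
    """Mimic an array-membership query: exact match on a lowercased element."""
    needle = term.strip().lower()
    return [d for d in docs if needle in d["labelsLower"]]
```

Array membership is an exact match on whole elements, so `"DOG"` finds a photo labelled `"Dog"`, but the partial term `"do"` does not.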

Python · Flask · Google Cloud Storage · Cloud Vision API · Firestore · Docker · Cloud Run

View code on GitHub

Machine Learning · Data Science · Time Series

F1 Driver Telemetry Classifier

Identifying F1 drivers from their telemetry-derived driving style.

An ML project that analyzes Formula 1 telemetry (speed, throttle, brake, RPM, gear) to distinguish driving styles between drivers and build a predictive model that identifies a driver from a lap. Data is pulled per-lap with FastF1, distance-resampled into uniform time series, and turned into per-lap feature matrices.

Telemetry ingestion via FastF1 with local caching across Grand Prix sessions.
Distance-based resampling to align laps into comparable, uniform feature vectors.
Exploratory analysis (box/violin/Q-Q plots) comparing throttle, brake and speed distributions per driver.
Supervised classification model to attribute laps to drivers (VER, RUS, NOR, PIA).
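Distance-based resampling, as described above, can be illustrated with plain linear interpolation: each lap's channel is re-evaluated on the same evenly spaced distance grid, so laps with different sample counts become vectors of equal length. This is a minimal sketch in pure Python, assuming strictly increasing distance values. The function name and arguments are hypothetical and do not reflect how the project or FastF1 does it.

```python
from bisect import bisect_right
from typing import List, Sequence


def resample_by_distance(
    distance: Sequence[float], values: Sequence[float], n_points: int
) -> List[float]:
    """Linearly interpolate `values` onto n_points evenly spaced distances."""
    if len(distance) != len(values) or len(distance) < 2:
        raise ValueError("need two or more samples of equal length")
    if n_points < 2:
        raise ValueError("n_points must be at least 2")

    start, end = distance[0], distance[-1]
    step = (end - start) / (n_points - 1)
    resampled = []
    for i in range(n_points):
        d = start + i * step
        # Index of the sample at or just before d, clamped so j + 1 stays valid.
        j = min(max(bisect_right(distance, d) - 1, 0), len(distance) - 2)
        d0, d1 = distance[j], distance[j + 1]
        t = (d - d0) / (d1 - d0)
        resampled.append(values[j] + t * (values[j + 1] - values[j]))
    return resampled
```

Calling this with the same `n_points` for every lap and channel (speed, throttle, brake, and so on) gives rows that can be stacked into one feature matrix.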

Python · FastF1 · pandas · scikit-learn · matplotlib · seaborn

View code on GitHub

Journey

Experience & Education

MSc, Computer Science
2025 – present
University of Pannonia
Computer Science MSc.

Software Developer Trainee
Jul 2025 – Apr 2026
One Identity
Backend development on Safeguard for Privileged Sessions (SPS): implementing features, fixing bugs, extending test coverage, and participating in code reviews (Gerrit).

LLM-based machine-learning decision-support system
Jul – Sep 2024
University of Pannonia
Processing dummy university admissions data, data visualization, LLM-based report generation, prompt engineering, and applied machine-learning algorithms.

BSc, Computer Science (Software Design specialization)
2022 – 2025
Eötvös Loránd University (ELTE)
Computer Science BSc, Software Design specialization. Thesis: an LLM-based machine-learning decision-support system.

Bilingual Secondary School
2017 – 2022
Balatonalmádi Bilingual Gymnasium
English bilingual graduation and C1 advanced English certificate.

Toolbox

Skills & Languages

Languages

Python · C# · Java · SQL · Cypher

Databases

PostgreSQL · MSSQL · Neo4j · Pinecone · Firestore · BigQuery · Supabase

AI / ML

RAG · LLM APIs · LangGraph/LangChain · Supervised ML · Prompt Engineering · Embeddings

Cloud / DevOps

Google Cloud · Cloud Run · Docker · Git · CI/CD · GitLab · Jenkins · Azure DevOps

AI Agents & Tooling

Claude Code · Antigravity · Codex · n8n · Openclaw · Hermes agent

Spoken languages

Hungarian: Native
English: C1 (complex)
German: B2 (complex)

Interactive

Ask me anything

A retrieval-augmented chatbot grounded in my real CV and projects. It embeds your question, searches a Pinecone vector index, and answers only from what it finds — the same RAG stack I build with.
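The "answers only from what it finds" behaviour can be sketched as retrieval with a refusal threshold. The example below is a toy under stated assumptions: a bag-of-words `Counter` stands in for real embeddings, an in-memory list stands in for the Pinecone index, and `embed`, `cosine`, `answer`, and the `min_score` value are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def answer(question: str, passages: Sequence[str], min_score: float = 0.2) -> str:
    """Return the best-matching passage, or decline when nothing is close enough."""
    q = embed(question)
    best = max(passages, key=lambda p: cosine(q, embed(p)), default=None)
    if best is None or cosine(q, embed(best)) < min_score:
        return "I don't have that in my sources."
    return best
```

Declining below the threshold is what keeps the assistant from answering questions its sources do not cover.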

Ask my portfolio anything

RAG over my CV & projects — Pinecone + OpenAI

