AI Technical Lead

What you need to know about the job

Job Requirements

5+ years of Python and C# experience, including building REST/GraphQL APIs and LangChain or Semantic Kernel (SK) pipelines

Hands-on experience with Ollama: installing it, running models of 7B parameters or larger, exposing REST endpoints, and integrating via SDKs

RAG & Vector Stores: designing embedding flows, chunking, and caching for local-first retrieval

GPU/Hardware optimization: quantization, batching, NUMA awareness; able to articulate minimum hardware specs for 7–13B-parameter models

Responsibilities

Package, pull, and tune Llama-class models with Ollama’s CLI & REST API; manage model files and versioning

Profile CPU/GPU utilization, quantization levels (4-bit, Q_K), VRAM needs, and Node feature gates

Build RAG pipelines that ground Ollama models with Azure AI Search (vector + hybrid) and local embeddings

Compose cross-boundary workflows using the SK Ollama connector for local inference and the SK Azure OpenAI connector for cloud scale; implement planners and memory stores

Benchmark local token latency against cloud inference; decide thresholds for bursting to the cloud; recommend hardware upgrades

Train engineers on LangChain-Ollama patterns, SK plugin authoring, and hybrid deployment trade-offs
