Назад към всички

Weaviate

// Build vector search with Weaviate using v4 syntax, proper module configuration, and production-ready patterns.

$ git log --oneline --stat
stars:1,933
forks:367
updated:March 4, 2026
SKILL.mdreadonly
SKILL.md Frontmatter
nameWeaviate
slugweaviate
version1.0.0
descriptionBuild vector search with Weaviate using v4 syntax, proper module configuration, and production-ready patterns.
metadata[object Object]

Critical: v4 Only (Dec 2024+)

v3 syntax is DEPRECATED. Before generating ANY Weaviate code:

  1. Verify client version — must be weaviate-client>=4.0
  2. Use context managerswith weaviate.connect_to_*() as client: or explicit client.close()
  3. New importsfrom weaviate.classes.config import Configure, Property

If you see v3 patterns (weaviate.Client(), client.schema.create_class(), path=[...] filters), stop and rewrite.

Quick Reference

TopicFile
v3→v4 migration tablev4-syntax.md
Module configurationmodules.md
Batch, hybrid, HNSWoperations.md

v4 Syntax Essentials

# Connection (ALWAYS close)
with weaviate.connect_to_local() as client:
    # Collections (not classes)
    collection = client.collections.get("Article")
    
    # Queries
    response = collection.query.hybrid("search term", alpha=0.7)
    
    # Vector access
    vector = obj.vector["default"]  # Dict, not List
    
    # Filters
    Filter.by_property("category").equal("tech")

Scope

This skill covers:

  • Schema design for RAG and semantic search
  • Vectorizer and reranker module configuration
  • Batch imports with error handling
  • Hybrid search tuning (alpha parameter)
  • HNSW index configuration for scale

Core Rules

1. Always Verify Modules

Before using text2vec-openai, generative-openai, or rerankers, verify they're enabled:

# docker-compose.yml
ENABLE_MODULES: 'text2vec-openai,generative-openai,reranker-cohere'

2. API Keys in Headers

client = weaviate.connect_to_local(
    headers={"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]}
)

3. Batch with Context Manager

with client.batch.dynamic() as batch:
    for item in data:
        batch.add_object(properties=item, collection="Name")

4. Hybrid Search Alpha

  • alpha=0 → BM25 only (keyword)
  • alpha=1 → Vector only (semantic)
  • alpha=0.5-0.75 → Balanced (typical for RAG)

5. Apply Filters BEFORE Vector Search

Filters in where reduce the search space first — always filter before near_text/near_vector.

6. Named Vectors vs Single Vector

Choose one pattern per collection:

# Single vector (simpler)
vectorizer_config=Configure.Vectorizer.text2vec_openai()

# Named vectors (multiple embeddings per object)
vector_config=[
    Configure.Vectors.text2vec_openai(name="content", source_properties=["body"]),
]

7. Debug Empty Results

Check in order: schema exists → vectorizer ran → distance threshold → filter syntax. Use _additional { vector } to verify vectors were generated.