OSS Docs / Examples epic #1224

Open
6 of 17 tasks
AyushExel opened this issue Apr 17, 2024 · 1 comment
Assignees
Labels
documentation Improvements or additions to documentation


AyushExel commented Apr 17, 2024

Description

Experiments

  • Diffusion training
  • Saving/Loading model weight chunks using pylance
    • Plugin for vanilla PyTorch with features to load, save, and version models
    • Support for HF Accelerate
    • Support for DeepSpeed for transformer inference
    • Support for manual sharding based on a device map file

Guides:

  • Fine-tuning embedding models
  • Improve retrieval systems

Undocumented features:

  • Pydantic optional vector field
  • Hugging Face integration

Doc improvement tips based on onboarding feedback:

Concepts

  • What is the preferred way to ingest data?
    - We support various ways to ingest data: pylist, pydict, Arrow, RecordBatch, and a list of Pydantic models.
    - We don't have an official preferred format.

  • What is the preferred way to define a LanceDB table schema?
    - We allow defining the schema explicitly via PyArrow or Pydantic, but we don't say which is preferred.
    - Pydantic is required for using the Embedding API, which is what we want users to use.

  • Better document drop_table

  • Enrich integrations pages with examples:

    • LangChain
    • LlamaIndex
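
To make the ingestion question under Concepts concrete, here is a rough sketch of how the first two shapes differ. It assumes "pylist" and "pydict" carry the same meaning as in PyArrow (a list of row dicts versus a dict of column lists). The helper names `pylist_to_pydict` and `pydict_to_pylist` are hypothetical, written in plain Python for illustration, and are not part of the LanceDB API.

```python
from typing import Any


def pylist_to_pydict(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Turn row-oriented records (pylist) into column-oriented lists (pydict)."""
    columns: dict[str, list[Any]] = {}
    # Collect every field name first so rows with missing fields stay aligned.
    for row in rows:
        for key in row:
            columns.setdefault(key, [])
    for row in rows:
        for key, values in columns.items():
            values.append(row.get(key))  # a missing field becomes None
    return columns


def pydict_to_pylist(columns: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Turn column-oriented lists (pydict) back into row-oriented records (pylist)."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    n_rows = lengths.pop() if lengths else 0
    return [{key: values[i] for key, values in columns.items()} for i in range(n_rows)]


# The same two records in both shapes (sample data made up for illustration).
rows = [
    {"id": 1, "text": "hello", "vector": [0.1, 0.2]},
    {"id": 2, "text": "world", "vector": [0.3, 0.4]},
]
columns = pylist_to_pydict(rows)
```

Both shapes hold the same information, which is part of why the docs would benefit from stating when each one is the better fit.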

Docs typos/bug

-

Links

AyushExel added the documentation label Apr 17, 2024
raghavdixit99 self-assigned this May 13, 2024
AyushExel changed the title from "OSS Docs epic" to "OSS Docs / Examples epic" May 20, 2024
raghavdixit99 commented

Llama index todo this sprint.

AyushExel added a commit that referenced this issue Jun 8, 2024
…cement (#1326)

- Tried to address some of the onboarding feedback listed in
#1224
- Improve visibility of pydantic integration and embedding API. (Based
on onboarding feedback - Many ways of ingesting data, defining schema
but not sure what to use in a specific use-case)
- Add a guide that takes users through testing and improving retriever
performance using built-in utilities like hybrid-search and reranking
- Add some benchmarks for the above
- Add missing cohere docs

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>