---
title: Document Insights - Extractive & Generative Methods
emoji: πŸ‘‘
colorFrom: indigo
colorTo: indigo
sdk: streamlit
sdk_version: 1.23.0
app_file: app.py
pinned: false
---

# Document Insights - Extractive & Generative Methods using Haystack

This is a template [Streamlit](https://docs.streamlit.io/) app set up for
simple [Haystack search applications](https://docs.haystack.deepset.ai/docs/semantic_search). The template is ready to
do QA with **Retrieval Augmented Generation** or **Extractive QA**.

Below you will also find instructions on how to
[push this to Hugging Face Spaces 🤗](#pushing-to-hugging-face-spaces-).

## Installation and Running

### Local development

To run the bare application, which does _nothing_ yet:

1. Install requirements: `pip install -r requirements.txt`
2. Run the streamlit app: `streamlit run app.py`

This will start up the app on `localhost:8501`, where you will find a simple search bar. Until you start editing, the
app will only show you instructions on what to edit.

### Docker

To run the app in a Docker container:

1. Build the Docker image: `docker build -t haystack-streamlit .`
2. Run the Docker container: `docker run -p 8501:8501 haystack-streamlit` (make sure to bind any other ports you need)
3. Open your browser and go to `http://localhost:8501`

### Repo structure

- `./utils`: This is where we have 3 files:
    - `config.py`: This file reads all of the configuration settings from a `.env` file, falling back to default
      values for some of them. An example of this is
      in [this demo project](https://github.com/TuanaCelik/should-i-follow/blob/main/utils/config.py). A minimal
      sketch of the pattern appears after this list.
    - `haystack.py`: Here you will find some functions already set up for you to start creating your Haystack search
      pipeline. It includes two main functions: `start_haystack()`, which creates the pipeline and caches it, and
      `query()`, which `app.py` calls once a user query is received.
    - `ui.py`: Use this file for any UI and initial value setups.
- `app.py`: This is the main Streamlit application file that we will run. In its current state it has a simple search
  bar, a 'Run' button, and a response area in which answers can be highlighted.
- `requirements.txt`: This file includes the required libraries to run the Streamlit app.
- `document_qa_engine.py`: This file includes the QA pipeline with Haystack.
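
For reference, here is a minimal sketch of the `.env`-with-defaults pattern that `config.py` follows. The variable
names are illustrative, not the exact ones this repo uses:

```python
# Illustrative sketch of the config pattern; the actual variable names in
# this repo's utils/config.py may differ.
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # load variables from a .env file into the environment

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")  # no default: must be set
MODEL = os.getenv("MODEL", "deepset/roberta-base-squad2")  # falls back to a default
TOP_K = int(os.getenv("TOP_K", "3"))  # env vars are strings, so convert
```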

### What to edit?

There are default pipelines in both `start_haystack_extractive()` and `start_haystack_rag()`. A sketch of the
extractive variant follows the list below.

- Change the pipelines to use the embedding models, extractive or generative models as you need.
- If using the `rag` task, change the `default_prompt_template` to use one of our available ones
  on [PromptHub](https://prompthub.deepset.ai) or create your own `PromptTemplate`.
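
For orientation, below is a minimal sketch of what the extractive variant can look like with the Haystack 1.x API.
It is an illustration, not this repo's exact code; the model names, the `st.cache_resource` caching, and the `top_k`
value are placeholders:

```python
# Illustrative sketch only (Haystack 1.x API): the real pipelines live in
# this repo's utils/haystack.py and document_qa_engine.py and may differ.
import streamlit as st
from haystack import Pipeline
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import EmbeddingRetriever, FARMReader


@st.cache_resource(show_spinner=False)  # build once per session, then reuse
def start_haystack_extractive():
    # Documents still need to be written to the store and embedded
    # (document_store.update_embeddings(retriever)) before querying.
    document_store = InMemoryDocumentStore(embedding_dim=384)
    retriever = EmbeddingRetriever(
        document_store=document_store,
        embedding_model="sentence-transformers/all-MiniLM-L6-v2",  # assumed model
    )
    reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")  # assumed model

    pipeline = Pipeline()
    pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
    pipeline.add_node(component=reader, name="Reader", inputs=["Retriever"])
    return pipeline


def query(question: str):
    pipeline = start_haystack_extractive()
    return pipeline.run(query=question, params={"Retriever": {"top_k": 3}})
```

The `rag` variant would swap the `FARMReader` for a `PromptNode` carrying a `PromptTemplate`, or one fetched from
PromptHub by name.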

### Using local LLM models

To use the `local_llm` mode you can serve a model with [LM Studio](https://lmstudio.ai/)
or [Ollama](https://ollama.com/). The `local_llm` mode expects an OpenAI-compatible API available
at `http://localhost:1234/v1`; for details on starting such a server, refer to the documentation of the tool you are
using.
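
To verify that a compatible server is actually listening there, you can send a test request with the `openai` client.
This is a sketch under the assumption that the server speaks the OpenAI chat completions API (LM Studio and Ollama
both do); the model name is a placeholder:

```python
# Quick sanity check (not part of this repo): confirm an OpenAI-compatible
# server is reachable where local_llm mode expects it.
from openai import OpenAI  # pip install openai

client = OpenAI(
    base_url="http://localhost:1234/v1",
    api_key="not-needed",  # local servers typically ignore the key
)
response = client.chat.completions.create(
    model="local-model",  # placeholder; LM Studio uses whatever model is loaded
    messages=[{"role": "user", "content": "Say hello"}],
)
print(response.choices[0].message.content)
```

Note that LM Studio listens on port 1234 by default, while Ollama's OpenAI-compatible endpoint defaults to
`http://localhost:11434/v1`, so with Ollama you will need to adjust the port on one side or the other.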

## Pushing to Hugging Face Spaces πŸ€—

Below is an example GitHub Actions workflow that will let you push your Streamlit app straight to the Hugging Face Hub
as a Space.

A few things to pay attention to:

1. Create a New Space on Hugging Face with the Streamlit SDK.
2. Create a Hugging Face token on your HF account.
3. Create a secret on your GitHub repo called `HF_TOKEN` and put your Hugging Face token there.
4. If you're using DocumentStores or APIs that require some keys/tokens, make sure these are provided as a secret for
   your HF Space too!
5. This README's frontmatter tells HF Spaces that the app uses the Streamlit SDK and runs from `app.py`; edit the
   frontmatter to set the title, emoji, etc. that you want displayed.
6. Create a file at `.github/workflows/hf_sync.yml`. Below is an example that you can adapt with your own
   information; see also the [example workflow](https://github.com/TuanaCelik/should-i-follow/blob/main/.github/workflows/hf_sync.yml)
   used by the [Should I Follow demo](https://huggingface.co/spaces/deepset/should-i-follow).

```yaml
name: Sync to Hugging Face hub
on:
  push:
    branches: [ main ]

  # to run this workflow manually from the Actions tab
  workflow_dispatch:

jobs:
  sync-to-hub:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0
          lfs: true
      - name: Push to hub
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: git push --force https://{YOUR_HF_USERNAME}:$HF_TOKEN@{YOUR_HF_SPACE_REPO} main
```
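
Replace `{YOUR_HF_USERNAME}` and `{YOUR_HF_SPACE_REPO}` with your own values before committing the workflow; a
Space's git remote typically looks like `huggingface.co/spaces/<username>/<space-name>`.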