bhaskartripathi/pdfChatter

Apr 9, 2023

Hey, I'm new on this website and I have no coding skills. I only got here because of your application. It's amazing. I have a suggestion (if it isn't appropriate to post it here, I'm sorry, just ignore).

My suggestion is: what if you added two optional fields to the user interface for setting the page range of the PDF that should be searched? (start on page: X; end on page: Y, then in only searches within this range instead of the entire pdf, which saves a lot of tokens). This is only a suggestion. Your app is already amazing, thank you for building it!

bhaskartripathi

Owner Apr 9, 2023

Hi sesriously,
Thank you for liking the app and providing this great suggestion. This is a very good idea to save tokens. I will make this implement it as soon as I find time and share it on this thread. Most probably on the next weekend as I find time for side projects only on the weekend.

Maup88

Apr 17, 2023

Hi, just tried out your solution. Really impressive! Relatively new to the HF platform, so sorry if I am doing this wrong. I see an use case emerge for my business, which is to use it for our machinery mechanics. By inspecting machinery manuals, helping them find the parts they need or to help them with diagnosing the issue. Is there a way to talk DM you? Would like to learn how to make best use of your tool. KR Maup

bhaskartripathi

Owner Apr 17, 2023

Hi @Maup88 - Thank you for liking the solution.
Please send me a PM on my Twitter : https://twitter.com/bt_veeru. I can share me cell no/email there.

lcl27

Apr 20, 2023

Hi bhaskartripathi,
Similarly, I am also enjoying trying your solution. I have a query with regards to data privacy - if feeding private PDFs uploaded from computer, what / who can see the data - i.e. is it only the transformer which uses it for processing and then deletes it or is the token text then used for something else or retained?

bhaskartripathi

Owner Apr 20, 2023

Hi @lcl27 ,
Technically the data is available to : 1) Google (Universal Sentence Encoder) - temporarily until the embeddings are created (for few seconds/mins) depending upon the size of embedding.; 2) OpenAI - Where the chunk of the data goes. OpenAI would use the data to improve its algorithms. If you want full data privacy the, I would suggest to use Stanford Alpaca on local machine. However, it hallicunates too much as of now and may get better in the future.

EgyptianBrince

Apr 27, 2023

How would I change the model from davinci to 3.5 turbo? I tried switching the names in the code and when I attempted to run it just stated error

andaqu

Apr 27, 2023

Setting the temperature to 0 might be a good way to reduce / omit hallucinations.

filoppo123

Apr 30, 2023

•

edited May 1, 2023

Hi @bhaskartripathi ! Same as @sesriously above, I am new here and came up to say thank you! Your solution works like a wonder.
I also had 3 questions:

[upload not working] when uploading the pdf file, I received the error [ERROR]: An unexpected error occurred: '_io.BufferedRandom' object has no attribute 'size'. Do you have any ideas on how to solve it? So far, I have been able to use it only with publicly available pdf URLs.
do you think there is a way for it to manage multiple pdfs and have in the reply the reference to the pdf on which the answer is based? (like it is now for page numbers)
considering that I need to ask for information about many pdf pages, the billing from GPT would be significant for me. Is there a way to make the script have memory of the conversation history? For instance, make it so I have to upload a given file just once, and not everytime I need to make the question.
I hope I explained myself clearly. Thank you again for this great solution!

Ausar19

May 24, 2023

This comment has been hidden

Ausar19

May 24, 2023

Hello everyone,

I recently attempted to utilize the OpenAI API for the first time, but I'm facing some challenges. I followed the necessary steps of providing my OpenAI API key, attaching a PDF file, and setting up the prompt. However, despite waiting for six hours (queue: 2718/2718 | 21907.3/21592.5s), I have not received any response. Since I lack experience with this technology, I'm uncertain if I made an error during the process.

I want to clarify that I have not exceeded the free limit of the OpenAI API. I have thoroughly checked my API key and ensured that I followed the documentation accurately. I'm curious to know if anyone else has encountered a similar issue or if there is a known cause for this problem. Any guidance or suggestions you can provide would be greatly appreciated.

I have uploaded an image to illustrate the exact situation. The seconds keep counting indefinitely, and there is no generation of a response.

Thank you in advance for your assistance!

bhaskartripathi

Owner May 25, 2023

Hi @Ausar19 - Your API key has expired as per error logs. Can you please try with a new API key

bhaskartripathi

Owner May 25, 2023

Ekuboh

May 28, 2023

•

edited May 28, 2023

hello, bhaskartripathi. are there some limits of the pdf upload in this website? like words number, pages number

bhaskartripathi

Owner May 28, 2023

No there is no limit to any pdf size. I have tested pdfs up to 1150 pages. They work fine.

Spaces:

bhaskartripathi
/

pdfChatter

Runtime error

Suggestion