OCR TEXT CAPTURE AND OPENAI API INTEGRATION WITH AUTO-CAPTURE
Descrição da oferta de emprego
Develop a Windows app with the following functionality.
Allow the user to define a screenshot area by clicking the start and end points on the screen.
Capture the selected area as an image, extract text using OCR, and save both the extracted text and user-defined prompt to a log file.
Send the OCR text to the OpenAI API along with the prompt, receive a response, and display it in the app.
Repeat this process for each new capture, clearing previous data.
Functional Requirements Prompt Configuration.
The app should include a user interface (UI) with an input field where users can enter a prompt to guide the OpenAI API’s response based on the captured text.
If the user leaves the field empty, the app should use a default prompt.
Start and End Points for Capture.
When the user clicks a "Capture & Process" button, the app should prompt them to define a rectangular area on the screen by clicking twice.
First click.
Defines the start point (top-left corner).
Second click.
Defines the end point (bottom-right corner).
These coordinates will define the bounding box for capturing the screenshot.
Screenshot Capture.
After defining the start and end points, the app should capture the selected screen area.
The main app window should temporarily minimize or hide during this process for a clearer selection view, then restore once the points are set.
OCR Processing.
Use OCR to extract text from the captured image.
Display the extracted text in the app’s UI for confirmation and reference.
Save Text and Prompt to File.
Append each OCR result along with the associated prompt to a text file ([login to view URL]) for record-keeping.
The file should log each new capture in a clear, separated format, for example.
OCR Text.
Shows the extracted text.
Prompt.
Shows the user-defined prompt or the default one.
Use headers and footers (e.
., ---- New Capture ----) to separate each capture in the file.
OpenAI API Request.
Send the OCR text along with the prompt to the OpenAI API.
Display the API’s response in the app’s UI, allowing the user to see the outcome directly.
Repeatable Process.
Every time the "Capture & Process" button is clicked, the app should.
Clear any previous output from the display area.
Prompt the user to define a new capture area.
Reset the process, allowing for a new OCR extraction and OpenAI API interaction.
Detailed Steps for Development 1.
Set Up GUI Elements.
Create a tkinter interface with.
An input field for the prompt.
A "Capture & Process" button to initiate the selection and processing.
A text display area to show the OCR result and the OpenAI response.
2.
Handle Start and End Point Selection.
When the user clicks "Capture & Process".
Minimize or hide the main app window temporarily.
Bind left-click to set the start point (x1, y1) and right-click to set the end point (x2, y2).
After both points are set, unbind the click events and restore the main app window.
3.
Capture Screenshot from Selected Area.
Use the defined start and end coordinates to create a bounding box.
Capture the rectangular screen area within this bounding box as an image.
4.
Apply OCR to Extract Text.
Convert the captured image to text using OCR.
Display the OCR output in the app’s text display area so the user can view the extracted content.
5.
Append to Log File.
Open (or create if it doesn’t exist) [login to view URL] in append mode.
Save the OCR-extracted text and the user-defined prompt in a structured format.
Separate each entry with a clear header (---- New Capture ----) and footer.
Log the text and prompt on separate lines within each capture entry.
6.
Send Data to OpenAI API.
Send a request to the OpenAI API using the OCR text and the user-defined prompt as input.
Configure the API to receive a response based on the prompt’s instructions.
7.
Display OpenAI API Response.
Once a response is received, display it in the text area within the app.
Clear previous data from the display each time a new capture is initiated to ensure only relevant, current data is shown.
8.
Error Handling.
Ensure robust error handling for cases such as.
Failed OCR or invalid image data.
API errors (e.
., no response or connection issues).
Display user-friendly error messages if issues occur during OCR or API calls.
9.
Testing.
Thoroughly test each part of the process.
Ensure coordinates are correctly set by start and end points.
Validate that each capture correctly logs the text and prompt to the file.
Check that the OpenAI response accurately displays based on the prompt and OCR input.
Summary of Repeatable Workflow User clicks "Capture & Process".
User clicks to define the start point and then the end point for the area to capture.
The app captures the selected area, applies OCR, displays the text, logs it to [login to view URL], and sends it to OpenAI.
The app displays the API response in the UI.
The user can initiate a new capture by clicking "Capture & Process" again, with all previous data cleared for the next cycle.
Python ID do Projeto.
# Sobre o projeto 18 propostas Aberto para ofertas Projeto remoto Ativo em 33 minutos atrás
Detalhes da oferta
- Indeterminado
- Em todo Portugal
- Indeterminado - Indeterminado
- 04/11/2024
- 02/02/2025
Operates in 54 countries offering property, personal and business insurance, as well as accident, supplementary health, reinsurance and life insurance... we are looking for french and english speaking employees for our team in lisbon to support our customers (inbound calls, emails and chat) project starts......
Familiarity with plc programming and industrial automation protocols... young graduate with a strong desire for designing and implementing automation and robotics solutions... strong communication and collaboration skills... excellent problem-solving and troubleshooting skills... collaborate with cross-functional......
Requisitos do trabalho requirements: mandatory proficiency in english and dutch strong client-facing and communication skills customer service orientation available to work in fixed schedules role purpose: provide first level contact and convey resolutions to customer issues properly escalate unresolved......
Requisitos do trabalho mandatory proficiency in english and german strong client-facing and communication skills customer service orientation available to work in fixed schedules role purpose: provide first level contact and convey resolutions to customer issues properly escalate unresolved queries to......
Requisitos do trabalho mandatory proficiency in english and german strong client-facing and communication skills customer service orientation available to work in fixed schedules role purpose: provide first level contact and convey resolutions to customer issues properly escalate unresolved queries to......
Requisitos do trabalho requirements: mandatory proficiency in english and dutch strong client-facing and communication skills customer service orientation available to work in fixed schedules role purpose: provide first level contact and convey resolutions to customer issues properly escalate unresolved......
Requisitos do trabalho mandatory proficiency in french and english strong client-facing and communication skills customer service orientation available to work in fixed schedules role purpose: provide first level contact and convey resolutions to customer issues properly escalate unresolved queries to......
Requisitos do trabalho mandatory proficiency in english and dutch strong client-facing and communication skills customer service orientation available to work in fixed schedules role purpose: provide first level contact and convey resolutions to customer issues properly escalate unresolved queries to......
Requisitos do trabalho requirements: mandatory proficiency in english and dutch strong client-facing and communication skills customer service orientation available to work in fixed schedules role purpose: provide first level contact and convey resolutions to customer issues properly escalate unresolved......
Fluent in german and english... localize legal pages to comply with portuguese laws and regulations... adapt surveys and questionnaires for portuguese-speaking audiences, ensuring cultural relevance and clarity... marketing materials:- translate brochures and other marketing materials from de>pt and......