feat(anthropic): Add Anthropic PDF support (document type) in invoke #7496

Merged
jacoblee93 merged 8 commits into langchain-ai:main on Jan 18, 2025


adhambadr
Contributor

PDF support has been available in Anthropic's message types since claude-3-5-sonnet-20240620.

You can send a PDF document to Anthropic, which performs text extraction and image conversion and supplies the LLM with both (text + screenshot) for each page, enabling deep-dive analysis, text extraction, and more. It's pretty neat and handy, especially for structured output, so I added support for it in the LangChain ecosystem: right now, using the document type throws an unsupported-type error before the request is passed to the LLM.

I added support for the document source type and simplified the source object, so you can pass either just the base64 string or the object exactly as in Anthropic's documentation.
I also added a working example, which you can run with yarn example examples/src/prompts/pdf_document.ts
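
To make the two accepted shapes concrete, here is a minimal sketch of how a bare base64 string could be expanded into the object form that Anthropic's Messages API documents for PDFs. The helper name `normalizeDocumentSource` and the type names are hypothetical and are not taken from this PR's diff; only the `type` / `media_type` / `data` fields follow Anthropic's documented shape.

```typescript
// Hypothetical types for illustration; not the PR's actual code.
type Base64PdfSource = {
  type: "base64";
  media_type: "application/pdf";
  data: string;
};

type DocumentSourceInput = string | Base64PdfSource;

// Expand a bare base64 string into the full source object;
// pass an already-complete object through unchanged.
function normalizeDocumentSource(source: DocumentSourceInput): Base64PdfSource {
  if (typeof source === "string") {
    return { type: "base64", media_type: "application/pdf", data: source };
  }
  return source;
}

// Placeholder payload, not a real PDF.
const fromString = normalizeDocumentSource("JVBERi0xLjQ=");
const fromObject = normalizeDocumentSource({
  type: "base64",
  media_type: "application/pdf",
  data: "JVBERi0xLjQ=",
});
console.log(JSON.stringify(fromString) === JSON.stringify(fromObject));
```

Under this assumption, both call styles end up sending the same document block, which is why the shorthand is safe to offer alongside the full object.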

Here is an example usage:

import fs from "node:fs";
import { ChatAnthropic } from "@langchain/anthropic";

const llm = new ChatAnthropic({
  model: "claude-3-5-sonnet-20240620",
  // apiKey: "...", // falls back to the ANTHROPIC_API_KEY environment variable
});

// Local file: readFileSync already returns a Buffer
const file = fs.readFileSync("test.pdf");
const base64 = file.toString("base64");

// Or load remotely (web environment); use this instead of the local-file lines above:
// const res = await fetch("https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf");
// const buffer = await res.arrayBuffer();
// const base64 = Buffer.from(buffer).toString("base64");

const prompt = "Summarize for me the contents of this document";
const { content } = await llm.invoke([
  {
    role: "user",
    content: [
      {
        type: "text",
        text: prompt,
      },
      {
        // The PDF is passed as a "document" block with the base64 string as its source
        type: "document",
        source: base64,
      },
    ],
  },
]);

console.log(content);
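
The `content` returned by `invoke` can be either a plain string or an array of content blocks, so logging it directly may print a structure rather than the summary text. Below is a minimal sketch of flattening it. `contentToText` and the simplified `ContentBlock` type are hypothetical stand-ins rather than LangChain exports, and the sample blocks are made up for illustration.

```typescript
// Simplified stand-in for LangChain's message content types (assumption).
type ContentBlock = { type: string; text?: string };
type MessageContent = string | ContentBlock[];

// Join the text of all "text" blocks; return plain strings unchanged.
function contentToText(content: MessageContent): string {
  if (typeof content === "string") return content;
  return content
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text as string)
    .join("");
}

// Made-up sample content, not real model output.
const sample: MessageContent = [
  { type: "text", text: "This PDF " },
  { type: "text", text: "is a dummy file." },
];
console.log(contentToText(sample));
```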

It's my first PR to this project, so apologies if I missed something crucial. Feedback or improvements are welcome, as well as the shoutout to my Twitter.

Supported models as of Jan 2025:

  1. claude-3-5-sonnet-20240620
  2. claude-3-5-sonnet-20241022
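
Since only some models accept document blocks, a caller may want to check the model ID before attaching a PDF. The sketch below hard-codes the list above (as of Jan 2025); `supportsPdfInput` is a hypothetical guard and is not part of @langchain/anthropic.

```typescript
// Model IDs copied from the list above (as of Jan 2025); extend as needed.
const PDF_CAPABLE_MODELS: readonly string[] = [
  "claude-3-5-sonnet-20240620",
  "claude-3-5-sonnet-20241022",
];

// Hypothetical guard: true only for models named in the list above.
function supportsPdfInput(model: string): boolean {
  return PDF_CAPABLE_MODELS.includes(model);
}

console.log(supportsPdfInput("claude-3-5-sonnet-20241022"));
console.log(supportsPdfInput("claude-3-haiku-20240307"));
```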

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. auto:improvement Medium size change to existing code to handle new use-cases labels Jan 10, 2025
@jacoblee93
Collaborator

Ah nice!

Collaborator

@jacoblee93 jacoblee93 left a comment

Thanks for flagging this - see comment!

@camwardy

Thanks for adding this @adhambadr, hopefully this can get merged soon as it'd be really useful for us!

@jacoblee93 jacoblee93 changed the title Add Anthropic PDF support (document type) in invoke feat(anthropic): Add Anthropic PDF support (document type) in invoke Jan 18, 2025
@jacoblee93 jacoblee93 merged commit 94467fa into langchain-ai:main Jan 18, 2025
33 of 34 checks passed
@jacoblee93
Collaborator

Thank you!

@adhambadr
Contributor Author

Thank you!

Thanks a lot for the cleanup (removing the console call, the unnecessary import, etc.)!

FilipZmijewski added a commit to FilipZmijewski/langchainjs that referenced this pull request Jan 30, 2025
* Rename auth method in docs

* fix(core): Fix trim messages mutation bug (langchain-ai#7547)

* release(core): 0.3.31 (langchain-ai#7548)

* fix(community): Updated Embeddings URL (langchain-ai#7545)

* fix(community): make sure guardrailConfig can be added even with anthropic models (langchain-ai#7542)

* docs: Fix PGVectorStore import in install dependencies (TypeScript) example (langchain-ai#7533)

* fix(community): Airtable url (langchain-ai#7532)

* docs: Fix typo in OpenAIModerationChain example (langchain-ai#7528)

* docs: Resolves langchain-ai#7483, resolves langchain-ai#7274 (langchain-ai#7505)

Co-authored-by: jacoblee93 <[email protected]>

* docs: Rename auth method in IBM docs (langchain-ai#7524)

* docs: correct misspelling (langchain-ai#7522)

Co-authored-by: jacoblee93 <[email protected]>

* release(community): 0.3.25 (langchain-ai#7549)

* feat(azure-cosmosdb): add session context for a user mongodb (langchain-ai#7436)

Co-authored-by: jacoblee93 <[email protected]>

* release(azure-cosmosdb): 0.2.7 (langchain-ai#7550)

* fix(ci): Fix build (langchain-ai#7551)

* feat(anthropic): Add Anthropic PDF support (document type) in invoke (langchain-ai#7496)

Co-authored-by: jacoblee93 <[email protected]>

* release(anthropic): 0.3.12 (langchain-ai#7552)

* chore(core,langchain,community): Relax langsmith deps (langchain-ai#7556)

* release(community): 0.3.26 (langchain-ai#7557)

* release(core): 0.3.32 (langchain-ai#7558)

* Release 0.3.12 (langchain-ai#7559)

* Add deployment chat to chat class

* Upadate Watsonx sdk

* Rework interfaces in llms as well

* Bump watsonx-ai sdk version

* Remove unused code

* Add fake auth

---------

Co-authored-by: Jacob Lee <[email protected]>
Co-authored-by: Jacky Chen <[email protected]>
Co-authored-by: Mohamed Belhadj <[email protected]>
Co-authored-by: Brian Ploetz <[email protected]>
Co-authored-by: Eduard-Constantin Ibinceanu <[email protected]>
Co-authored-by: Jonathan V <[email protected]>
Co-authored-by: ucev <[email protected]>
Co-authored-by: crisjy <[email protected]>
Co-authored-by: Adham Badr <[email protected]>
FilipZmijewski added a commit to FilipZmijewski/langchainjs that referenced this pull request Jan 30, 2025
* Rename auth method in docs

* fix(core): Fix trim messages mutation bug (langchain-ai#7547)

* release(core): 0.3.31 (langchain-ai#7548)

* fix(community): Updated Embeddings URL (langchain-ai#7545)

* fix(community): make sure guardrailConfig can be added even with anthropic models (langchain-ai#7542)

* docs: Fix PGVectorStore import in install dependencies (TypeScript) example (langchain-ai#7533)

* fix(community): Airtable url (langchain-ai#7532)

* docs: Fix typo in OpenAIModerationChain example (langchain-ai#7528)

* docs: Resolves langchain-ai#7483, resolves langchain-ai#7274 (langchain-ai#7505)

Co-authored-by: jacoblee93 <[email protected]>

* docs: Rename auth method in IBM docs (langchain-ai#7524)

* docs: correct misspelling (langchain-ai#7522)

Co-authored-by: jacoblee93 <[email protected]>

* release(community): 0.3.25 (langchain-ai#7549)

* feat(azure-cosmosdb): add session context for a user mongodb (langchain-ai#7436)

Co-authored-by: jacoblee93 <[email protected]>

* release(azure-cosmosdb): 0.2.7 (langchain-ai#7550)

* fix(ci): Fix build (langchain-ai#7551)

* feat(anthropic): Add Anthropic PDF support (document type) in invoke (langchain-ai#7496)

Co-authored-by: jacoblee93 <[email protected]>

* release(anthropic): 0.3.12 (langchain-ai#7552)

* chore(core,langchain,community): Relax langsmith deps (langchain-ai#7556)

* release(community): 0.3.26 (langchain-ai#7557)

* release(core): 0.3.32 (langchain-ai#7558)

* Release 0.3.12 (langchain-ai#7559)

* fix(core): Prevent cache misses from triggering model start callback runs twice (langchain-ai#7565)

* fix(core): Ensure that cached flag in run extras is only set for cache hits (langchain-ai#7566)

* release(core): 0.3.33 (langchain-ai#7567)

* feat(community): Adds graph_document to export list (langchain-ai#7555)

Co-authored-by: quantropi-minh <[email protected]>
Co-authored-by: jacoblee93 <[email protected]>

* fix(langchain): Fix ZeroShotAgent createPrompt with correct formatted tool names (langchain-ai#7510)

* docs: Add document for AzureCosmosDBMongoChatMessageHistory (langchain-ai#7519)

Co-authored-by: root <root@CPC-yangq-FRSGK>

* fix(langchain): Allow pulling hub prompts with associated models (langchain-ai#7569)

* fix(community,aws): Update handleLLMNewToken to include chunk metadata (langchain-ai#7568)

Co-authored-by: jacoblee93 <[email protected]>

* feat(community): Provide fallback relationshipType in case it is not present in graph_transformer (langchain-ai#7521)

Co-authored-by: quantropi-minh <[email protected]>
Co-authored-by: jacoblee93 <[email protected]>

* docs: Add redirect (langchain-ai#7570)

* fix(langchain,core): Add shim for hub mustache templates with nested input variables (langchain-ai#7581)

* fix(chat-models): honor disableStreaming even for `generateUncached` (langchain-ai#7575)

* release(core): 0.3.34 (langchain-ai#7584)

* feat(langchain): Add hub entrypoint with automatic dynamic entrypoint of models (langchain-ai#7583)

* chore(ollama): Export `OllamaEmbeddingsParams` interface (langchain-ai#7574)

* docs: Clarify tool creation process in structured outputs documentation (langchain-ai#7578)

Co-authored-by: Sahar Shemesh <[email protected]>
Co-authored-by: jacoblee93 <[email protected]>

* fix(community): Set awaitHandlers to true in upstash ratelimit (langchain-ai#7571)

Co-authored-by: Jacob Lee <[email protected]>

* fix(core): Fix trim messages mutation (langchain-ai#7585)

* feat(openai): Make only AzureOpenAI respect Azure env vars, remove class defaults, update withStructuredOutput defaults (langchain-ai#7535)

* fix(community): Make postgresConnectionOptions optional in PostgresRecordManager (langchain-ai#7580)

Co-authored-by: jacoblee93 <[email protected]>

* release(community): 0.3.27 (langchain-ai#7586)

* release(ollama): 0.1.5 (langchain-ai#7587)

* Release 0.3.13 (langchain-ai#7588)

* release(openai): 0.4.0 (langchain-ai#7589)

* release(core): 0.3.35 (langchain-ai#7590)

* fix(ci): Update lock (langchain-ai#7591)

* feat(core): Allow passing returnDirect in tool wrapper params (langchain-ai#7594)

* release(core): 0.3.36 (langchain-ai#7595)

* fix(openai): Revert Azure default withStructuredOutput changes (langchain-ai#7596)

* release(openai): 0.4.1 (langchain-ai#7597)

* feat(openai): Refactor to allow easier subclassing (langchain-ai#7598)

* release(openai): 0.4.2 (langchain-ai#7599)

* feat(deepseek): Adds Deepseek integration (langchain-ai#7604)

* release(deepseek): 0.0.1 (langchain-ai#7608)

* feat: update Novita AI doc (langchain-ai#7602)

* Add deployment chat to chat class

* feat(langchain): Add DeepSeek to initChatModel (langchain-ai#7609)

* Release 0.3.14 (langchain-ai#7611)

* fix: Add test for pdf uploads anthropic (langchain-ai#7613)

* feat: Update google genai to support file uploads (langchain-ai#7612)

* chore(google-genai): Drop .only in test (langchain-ai#7614)

* release(google-genai): 0.1.7 (langchain-ai#7615)

* Upadate Watsonx sdk

* fix(core): Fix stream events bug when errors are thrown too quickly during iteration (langchain-ai#7617)

* release(core): 0.3.37 (langchain-ai#7619)

* fix(langchain): Fix Groq import for hub (langchain-ai#7620)

* docs: update README/intro

* Release 0.3.15

* feat(community): improve support for Tavily search tool args (langchain-ai#7561)

* feat(community): Add boolean metadata type support in Supabase structured query translator (langchain-ai#7601)

* feat(google-genai): Add support for fileUri in media type in Google GenAI (langchain-ai#7621)

Co-authored-by: Jacob Lee <[email protected]>

* release(google-genai): 0.1.8 (langchain-ai#7628)

* release(community): 0.3.28 (langchain-ai#7629)

* Rework interfaces in llms as well

* Bump watsonx-ai sdk version

* Remove unused code

* Add fake auth

* Fix broken changes

---------

Co-authored-by: Jacob Lee <[email protected]>
Co-authored-by: Jacky Chen <[email protected]>
Co-authored-by: Mohamed Belhadj <[email protected]>
Co-authored-by: Brian Ploetz <[email protected]>
Co-authored-by: Eduard-Constantin Ibinceanu <[email protected]>
Co-authored-by: Jonathan V <[email protected]>
Co-authored-by: ucev <[email protected]>
Co-authored-by: crisjy <[email protected]>
Co-authored-by: Adham Badr <[email protected]>
Co-authored-by: Minh Ha <[email protected]>
Co-authored-by: quantropi-minh <[email protected]>
Co-authored-by: Chi Thu Le <[email protected]>
Co-authored-by: fatmelon <[email protected]>
Co-authored-by: root <root@CPC-yangq-FRSGK>
Co-authored-by: Mohamad Mohebifar <[email protected]>
Co-authored-by: David Duong <[email protected]>
Co-authored-by: Brace Sproul <[email protected]>
Co-authored-by: Matus Gura <[email protected]>
Co-authored-by: Sahar Shemesh <[email protected]>
Co-authored-by: Sahar Shemesh <[email protected]>
Co-authored-by: Cahid Arda Öz <[email protected]>
Co-authored-by: Jason <[email protected]>
Co-authored-by: vbarda <[email protected]>
Co-authored-by: Vadym Barda <[email protected]>
Co-authored-by: Hugo Borsoni <[email protected]>
Co-authored-by: Arman Ghazaryan <[email protected]>
Co-authored-by: Andy <[email protected]>