@@ -26,23 +26,31 @@ is available):

    docker-compose up -d --build

## Deploy With Gradle

Then deploy a small REST API application to MarkLogic, which includes a basic non-admin MarkLogic user
named `langchain-user`:

    ./gradlew -i mlDeploy

## Install Python Libraries

Next, create a new Python virtual environment - [pyenv](https://github.com/pyenv/pyenv) is recommended for this -
and install the
[langchain example dependencies](https://python.langchain.com/docs/use_cases/question_answering/quickstart#dependencies),
along with the MarkLogic Python Client:

    pip install -U langchain langchain_openai langchain-community langchainhub openai chromadb bs4 marklogic_python_client

## Load Sample Data

Then run the following Python program to load text data from the langchain quickstart guide
into two different collections in the `langchain-test-content` database:

    python load_data.py

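For context, loader scripts like `load_data.py` typically split the source text into overlapping chunks before writing each chunk to the database as its own document. The sketch below is illustrative only (the actual script is not shown here and may use langchain's text splitters with different parameters):

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks, similar in spirit to the
    chunking a loader script performs before writing each chunk to
    MarkLogic as a separate document."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
        # Step forward by chunk_size minus overlap so adjacent
        # chunks share some context at their boundaries.
        start += chunk_size - overlap
    return chunks
```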
## Create Python Environment File

Create a ".env" file to hold your AzureOpenAI environment values. It should look
something like this:

```
@@ -89,4 +97,53 @@ query using the `marklogic_contextual_query_retriever.py` module in this project

This retriever builds a term-query using words from the question. The term-query is then
added to the structured query, and the merged query is used to select from the documents
loaded via `load_data.py`.

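The merging step described above can be sketched roughly as follows. The helper name and term-extraction heuristic are hypothetical (the actual logic lives in `marklogic_contextual_query_retriever.py`); the JSON shape follows MarkLogic's structured-query conventions:

```python
import re

def build_contextual_query(question: str, structured_query: dict) -> dict:
    """Hypothetical sketch: extract candidate terms from the question,
    wrap them in a term-query, and AND it with the caller's structured
    query so both constrain the document selection."""
    # Crude term extraction: alphabetic tokens longer than 3 characters.
    terms = [w.lower() for w in re.findall(r"[A-Za-z]+", question) if len(w) > 3]
    term_query = {"term-query": {"text": terms}}
    return {
        "query": {
            "queries": [
                {"and-query": {"queries": [structured_query, term_query]}}
            ]
        }
    }
```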
## Testing Using MarkLogic 12EA Vector Search

### MarkLogic 12EA Setup

To try out this functionality, you will need access to an instance of MarkLogic 12
(currently internal or Early Access only). You may use
[docker-compose](https://docs.docker.com/compose/) to instantiate a new MarkLogic
instance with port 8003 available (you can use your own MarkLogic instance too, just be
sure that port 8003 is available):

    docker-compose -f docker-compose-12.yml up -d --build

### Deploy With Gradle

You will also need to deploy the application. However, for this example, you will need
to include an additional switch on the command line to deploy a TDE schema that takes
advantage of the vector capabilities in MarkLogic 12:

    ./gradlew -i mlDeploy -PmlSchemasPath=src/main/ml-schemas-12

### Install Python Libraries

As above, if you have not yet installed the Python libraries, install them with pip:

```
pip install -U langchain langchain_openai langchain-community langchainhub openai chromadb bs4 marklogic_python_client
```

### Create Python Environment File

The Python script for this example also generates LLM embeddings and includes them in
the documents stored in MarkLogic. In order to generate the embeddings, you'll need to
add the following environment variables (with your values) to the .env file created
above:

```
AZURE_EMBEDDING_DEPLOYMENT_NAME=text-test-embedding-ada-002
AZURE_EMBEDDING_DEPLOYMENT_MODEL=text-embedding-ada-002
```

### Load Sample Data

Then run the following Python program to load text data from the langchain quickstart
guide into two different collections in the `langchain-test-content` database. Note that
this script is different from the one in the earlier setup section and loads the data
into different collections:

```
python load_data_with_embeddings.py
```
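
Once the embeddings are stored, vector search ranks documents by the similarity between a query embedding and each stored embedding. MarkLogic 12 computes this server-side; the plain-Python cosine-similarity sketch below is only meant to illustrate the measure being applied:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors: the dot
    product divided by the product of the vector norms. Returns 0.0
    for zero-length (all-zero) vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```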