Skip to content

Commit

Permalink
Add new blog draft and update references for GenAIScript samples
Browse files Browse the repository at this point in the history
  • Loading branch information
pelikhan committed Sep 19, 2024
1 parent 6147a4d commit 66183ea
Show file tree
Hide file tree
Showing 4 changed files with 279 additions and 4 deletions.
148 changes: 148 additions & 0 deletions docs/src/content/docs/blog/drafts/search-transform-genai.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,148 @@
---
title: "Search and Transform with GenAI"
date: 2024-09-19
authors: genaiscript
draft: true
tags: ["genai", "scripting", "automation"]
canonical_url: https://microsoft.github.io/genaiscript/blog/search-transform-genai
---

## Introduction

Have you ever found yourself in a situation where you need to search through multiple files in your project, find a specific pattern, and then apply a transformation to it? It can be a tedious task, but fear not! In this blog post, I'll walk you through a GenAIScript that does just that, automating the process and saving you time. 🕒💡

The script we'll be examining is a powerful tool that can search for text patterns in files using regular expressions and apply a transformation to each match using language models. We will look at the script step-by-step to understand how it works. Let's get started!

## The Script Explained

First things first, you can find the script on GitHub at [st.genai.mts](https://github.com/microsoft/genaiscript/blob/main/packages/vscode/genaisrc/st.genai.mts). Now, let's break down the script.

### Setting Up the Script

```ts
script({
title: "Search and transform",
description:
"Search for a pattern in files and apply a LLM transformation the match",
parameters: {
glob: {
type: "string",
description: "The glob pattern to filter files",
default: "*",
},
pattern: {
type: "string",
description: "The text pattern (regular expression) to search for",
},
transform: {
type: "string",
description: "The LLM transformation to apply to the match",
},
},
})
```

The script starts by defining its purpose and parameters using the `script` function. Here, we define the title, description, and the three parameters the script will need: `glob` to specify the files, `pattern` for the text to search for, and `transform` for the desired transformation.

### Extracting and Validating Parameters

```ts
const { pattern, glob, transform } = env.vars
if (!pattern) cancel("pattern is missing")
const patternRx = new RegExp(pattern, "g")

if (!transform) cancel("transform is missing")
```

Next, we extract the `pattern`, `glob`, and `transform` parameters from the environment variables and validate them. If `pattern` or `transform` are missing, the script will cancel execution. We then compile the `pattern` into a regular expression object for later use.

### Searching for Files and Matches

```ts
const { files } = await workspace.grep(patternRx, glob)
```

Here, we use the `grep` function from the `workspace` API to search for files that match the `glob` pattern and contain the regex pattern.

### Transforming Matches

```ts
// cached computed transformations
const patches = {}
for (const file of files) {
console.log(file.filename)
const { content } = await workspace.readText(file.filename)
// skip binary files
if (!content) continue
// compute transforms
for (const match of content.matchAll(patternRx)) {
console.log(` ${match[0]}`)
if (patches[match[0]]) continue
```
We initialize an object called `patches` to store the transformations. Then, we loop through each file, read its content, and skip binary files. For each match found in the file's content, we check if we've already computed a transformation for this match to avoid redundant work.
### Generating Prompts for Transformations
```ts
const res = await runPrompt(
(_) => {
_.$`
## Task
Your task is to transform the MATCH with the following TRANSFORM.
Return the transformed text.
- do NOT add enclosing quotes.
## Context
`
_.def("MATCHED", match[0])
_.def("TRANSFORM", transform)
},
{ label: match[0], system: [], cache: "search-and-transform" }
)
```
For each unique match, we generate a prompt using the `runPrompt` function. In the prompt, we define the task and context for the transformation, specifying that the transformed text should be returned without enclosing quotes. We also define the matched text and the transformation to apply.
### Applying the Transformation
```ts
const transformed = res.fences?.[0].content ?? res.text
if (transformed) patches[match[0]] = transformed
console.log(` ${match[0]} -> ${transformed ?? "?"}`)
}
// apply transforms
const newContent = content.replace(
patternRx,
(match) => patches[match] ?? match
)
```
We then extract the transformed text from the prompt result and store it in the `patches` object. Finally, we apply the transformations to the file content using `String.prototype.replace`.
### Saving the Changes
```ts
if (content !== newContent)
await workspace.writeText(file.filename, newContent)
}
```

If the file content has changed after applying the transformations, we save the updated content back to the file.

## Running the Script

To run this script, you'll need the GenAIScript CLI. Check out the [installation guide](https://microsoft.github.io/genaiscript/getting-started/installation) if you need to set it up. Once you have the CLI, run the script by executing:

```bash
genaiscript run st
```

Be sure to replace `st` with the actual name of your script if you've named it differently.

## Conclusion

This script is a fantastic example of how GenAIScript can simplify complex tasks like searching and transforming text across multiple files. By following the steps outlined, you can create your own scripts to automate your workflows and boost your productivity. Happy scripting! 🚀

Remember to check out the script on GitHub and tweak it to suit your needs. If you have any questions or want to share your own scripting experiences, feel free to leave a comment below!
3 changes: 1 addition & 2 deletions docs/src/content/docs/reference/vscode/samples/iat.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,6 @@ Or perhaps you're aiming to make your content more accessible by providing descr

In this sample, we'll dive into a GenAIScript tool that automates the generation of alt text for images in Markdown files,
making your content more inclusive and SEO-friendly!
🚀 Check out the [script on GitHub](https://github.com/microsoft/genaiscript/blob/main/packages/vscode/genaisrc/iat.genai.mts).

### 👩‍💻 Understanding the Script Code

Expand Down Expand Up @@ -128,6 +127,6 @@ That's it! You now have the power to enhance your Markdown files with descriptiv

Happy coding! 🚀

## Full source
## Full source ([GitHub](https://github.com/microsoft/genaiscript/blob/main/packages/vscode/genaisrc/iat.genai.mts))

<Code code={source} wrap={true} lang="ts" title="iat.genai.mts" />
130 changes: 129 additions & 1 deletion docs/src/content/docs/reference/vscode/samples/st.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,134 @@ description: Search for a pattern in files and apply a LLM transformation the ma
import { Code } from "@astrojs/starlight/components"
import source from "../../../../../../../packages/vscode/genaisrc/st.genai.mts?raw"

## st
Search And Replace is a powerful tool in the developer toolbelt that can save you time and effort...
if you can forumlate the right regular expression.

**Search and Transform** is a twist on the same concept
but we use a LLM to perform the transformation instead of a simple string replacement.

### 👩‍💻 Understanding the Script Code

```ts
script({
title: "Search and transform",
description:
"Search for a pattern in files and apply a LLM transformation the match",
parameters: {
glob: {
type: "string",
description: "The glob pattern to filter files",
default: "*",
},
pattern: {
type: "string",
description: "The text pattern (regular expression) to search for",
},
transform: {
type: "string",
description: "The LLM transformation to apply to the match",
},
},
})
```

The script starts by defining its purpose and parameters using the `script` function. Here, we define the title, description, and the three parameters the script will need: `glob` to specify the files, `pattern` for the text to search for, and `transform` for the desired transformation.

### Extracting and Validating Parameters

```ts
const { pattern, glob, transform } = env.vars
if (!pattern) cancel("pattern is missing")
const patternRx = new RegExp(pattern, "g")

if (!transform) cancel("transform is missing")
```

Next, we extract the `pattern`, `glob`, and `transform` parameters from the environment variables and validate them. If `pattern` or `transform` are missing, the script will cancel execution. We then compile the `pattern` into a regular expression object for later use.

### Searching for Files and Matches

```ts
const { files } = await workspace.grep(patternRx, glob)
```

Here, we use the `grep` function from the `workspace` API to search for files that match the `glob` pattern and contain the regex pattern.

### Transforming Matches

```ts
// cached computed transformations
const patches = {}
for (const file of files) {
console.log(file.filename)
const { content } = await workspace.readText(file.filename)
// skip binary files
if (!content) continue
// compute transforms
for (const match of content.matchAll(patternRx)) {
console.log(` ${match[0]}`)
if (patches[match[0]]) continue
```
We initialize an object called `patches` to store the transformations. Then, we loop through each file, read its content, and skip binary files. For each match found in the file's content, we check if we've already computed a transformation for this match to avoid redundant work.
### Generating Prompts for Transformations
```ts
const res = await runPrompt(
(_) => {
_.$`
## Task
Your task is to transform the MATCH with the following TRANSFORM.
Return the transformed text.
- do NOT add enclosing quotes.
## Context
`
_.def("MATCHED", match[0])
_.def("TRANSFORM", transform)
},
{ label: match[0], system: [], cache: "search-and-transform" }
)
```
For each unique match, we generate a prompt using the `runPrompt` function. In the prompt, we define the task and context for the transformation, specifying that the transformed text should be returned without enclosing quotes. We also define the matched text and the transformation to apply.
### Applying the Transformation
```ts
const transformed = res.fences?.[0].content ?? res.text
if (transformed) patches[match[0]] = transformed
console.log(` ${match[0]} -> ${transformed ?? "?"}`)
}
// apply transforms
const newContent = content.replace(
patternRx,
(match) => patches[match] ?? match
)
```
We then extract the transformed text from the prompt result and store it in the `patches` object. Finally, we apply the transformations to the file content using `String.prototype.replace`.
### Saving the Changes
```ts
if (content !== newContent)
await workspace.writeText(file.filename, newContent)
}
```

If the file content has changed after applying the transformations, we save the updated content back to the file.

## Running the Script

To run this script, you'll need the GenAIScript CLI. Check out the [installation guide](https://microsoft.github.io/genaiscript/getting-started/installation) if you need to set it up. Once you have the CLI, run the script by executing:

```bash
genaiscript run st
```

## Full source ([GitHub](https://github.com/microsoft/genaiscript/blob/main/packages/vscode/genaisrc/st.genai.mts))

<Code code={source} wrap={true} lang="ts" title="st.genai.mts" />
2 changes: 1 addition & 1 deletion packages/sample/genaisrc/blogify-sample.genai.mjs
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@ $`Create a blog post that explains the GenAIScript source code in FILE.
- save generate text to blog post drafts folder
- ignore existing blog posts
- avoid using "delve"
- do NOT explain comments
- do not repeat or explain the script file top comment
- include canonical_url to genaiscript blog at https://microsoft.github.io/genaiscript/blog/<topic>
- use language "ts" for mts snippets and "js" for mjs snippets
- use markdown headers starting from level 2
Expand Down

0 comments on commit 66183ea

Please sign in to comment.