Studios getting started guide #479

Draft
wants to merge 17 commits into
base: master

llewellyn-sl
Contributor

No description provided.

@llewellyn-sl llewellyn-sl self-assigned this Feb 18, 2025

netlify bot commented Feb 18, 2025

Deploy Preview for seqera-docs ready!

Name Link
🔨 Latest commit 5deef93
🔍 Latest deploy log https://app.netlify.com/sites/seqera-docs/deploys/67b824c318f79600098db56f
😎 Deploy Preview https://deploy-preview-479--seqera-docs.netlify.app
Member

@robnewman robnewman left a comment


Amazing work! Super close - just some consistency issues and formatting. Also, swear jar for a couple of "Data studios" :)

```
- conda
```

- Select **Add** or choose to **Add and start** the studio immediately.
Member

Suggested change
- Select **Add** or choose to **Add and start** the studio immediately.
- Select **Add** or choose to **Add and start** the studio session immediately.

- If you chose to **Add** the studio in the preceding step, select **Connect** in the options menu to open the studio in a new browser tab.
Member

Suggested change
- If you chose to **Add** the studio in the preceding step, select **Connect** in the options menu to open the studio in a new browser tab.
- If you chose to **Add** the studio session in the preceding step, select **Connect** in the options menu to open the session in a new browser tab.


- Once inside the studio, run `code.` to be able to use the clipboard.
Member

Suggested change
- Once inside the studio, run `code.` to be able to use the clipboard.
- Once inside the studio, run `code.` to use the clipboard.


#### Run nf-core/fetchngs with Conda
Member

Same question about heading level

```
nextflow run nf-core/fetchngs -profile test,conda --outdir ./nf-core-fetchngs-conda-out -resume
```
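
For readers of this thread: the command in the hunk above mixes two kinds of flags. Single-dash flags (`-profile`, `-resume`) are Nextflow options, while double-dash flags (`--outdir`) are pipeline parameters. The following is a minimal sketch that labels each argument and prints the command as a dry run, assuming a Bash shell. The `classify_arg` helper and the `RUN` variable are hypothetical names introduced here for illustration; they are not part of Nextflow or the guide.

```shell
#!/usr/bin/env bash
# Sketch only: label each argument of the fetchngs command from the guide.
set -euo pipefail

OUTDIR="./nf-core-fetchngs-conda-out"   # same output path as in the guide

# Store the command as an array so each argument stays a separate word.
cmd=(nextflow run nf-core/fetchngs -profile test,conda --outdir "$OUTDIR" -resume)

# classify_arg (hypothetical helper): distinguish flag types by prefix.
classify_arg() {
  case "$1" in
    --*) echo "pipeline parameter" ;;  # double dash: passed to the pipeline
    -*)  echo "Nextflow option" ;;     # single dash: handled by Nextflow itself
    *)   echo "value" ;;               # command words and flag values
  esac
}

for arg in "${cmd[@]}"; do
  printf '%-32s %s\n' "$arg" "$(classify_arg "$arg")"
done

# Dry run by default: print the command instead of executing it.
# Set RUN=1 (hypothetical switch) to execute; that requires Nextflow and Conda.
if [[ "${RUN:-0}" == "1" ]]; then
  "${cmd[@]}"
else
  printf 'Would run: %s\n' "${cmd[*]}"
fi
```

Keeping the command in an array avoids quoting problems if the output path ever contains spaces, and the dry-run default makes the sketch safe to paste into a studio terminal before committing to a full run.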

#### Write a Nextflow pipeline with nf-core tools
Member

Same question about heading level


@gwright99 gwright99 left a comment


Reviewed and commented as requested. Not explicitly asking for changes / approving since I'm tangential to your publication timeline.

toc_max_heading_level: 3
---

[Studios](../data_studios/index.mdx) allows users to host a variety of container images directly in Seqera Platform compute environments for analysis using popular environments including Jupyter (Python) and RStudio notebooks (R), Visual Studio Code IDEs, and Xpra remote desktops. Each studio session provides a dedicated interactive environment that encapsulates the live environment.


Add hyperlinks? I didn't know what Xpra was when I first heard it being talked about.

:::info[**Prerequisites**]
You will need the following to get started:

- Valid credentials for your cloud storage account and compute environment


Link to Credentials page in general docs?


### Jupyter: Python-based visualization of protein structure prediction data

Studios and Jupyter notebooks enable interactive analysis using Python libraries and tools. For example, PyMOL is a powerful tool used for visualizing and comparing structures produced by workflows such as [nf-core/proteinfold](https://nf-co.re/proteinfold/1.1.1), a bioinformatics best-practice analysis pipeline for protein 3D structure prediction. This section demonstrates how to create an AWS Batch compute environment, add the nf-core AWS megatests public proteinfold data to your workspace, create a Jupyter studio, and run the provided Python script to produce interactive composite 3D images of the H1065 sequence.


I would remove the starting "Studios and". AFAIK the first paragraph isn't dependent on Studios.


#### Create an AWS Batch compute environment

Studios requires an AWS Batch compute environment. If you do not have an existing compute environment available, create one with the following attributes:


Link to the Forge / Manual CE instructions for those who don't know how to do it?


#### Add data using Data Explorer

For the purposes of this guide, add the proteinfold results (H1065 sequence) from the nf-core AWS megatests S3 bucket to your workspace using Data Explorer:


Link to the prefix, or is the assumption that everyone knows what this is?
Never mind, I see it's 2 lines down.


#### Perform the analysis and explore results

1. Configure the RStudio environment with installed packages, including [ShinyNGS](https://github.com/pinin4fjords/shinyngs):


I have no familiarity with R. Leaving this section to Florian's expert eye for review.

```yaml
channels:
- conda-forge
- bioconda
```


Any specific reason why conda is being used over pip here? Worth calling out why?

To use your own data for interactive analysis, see [Add a cloud bucket](./quickstart-demo/add-data.mdx#add-a-cloud-bucket) for instructions to add your own public or private cloud bucket.
:::

#### Create an Xpra studio


Isn't the whole "killer feature" of Xpra the ability for multiple people to collaborate in the container within the same session? If I'm right, this seems like something you'd want to call out beyond the solo interaction steps currently documented.

Member

The killer feature of Xpra is also that it's a whole remote desktop with a GUI.

1. Search for PCSK9 and zoom into one of the exons of the gene. A coverage graph and reads should be shown, as below:
![BAM file view](./_images/xpra-data-studios-IGV-view-bam.png)

### VS Code: Create an interactive Nextflow developer environment


Does there need to be any text here that specifically talks about the Nextflow VS Code plugin?

i.e. "it'll work, you should use it" vs "omg, don't even think about trying that right now"


IIRC VS Code has about a bazillion ways to import user preferences / configurations. Does the tutorial need to consider this angle (e.g. if I have a very customized local VS Code for executing my workflows and I now want to port this to a DS session, I assume I'd want my customizations to come with me)?


4 participants