This tutorial uses a the 10x Genomics Cloud Analysis platform to preform the raw data processing for a set of human PBMCs with the intent to explore neutrophil maturation. The goals are to demonstrate the following:
- Analysis begins with design: the goals of the experiment will determine how the data should be processed and analyzed.
- Signing in and creating a project
- Using the web based FASTQ uploader
- Creating a new analysis
- Downloading output files
Note: these instructions are preforming double duty as a stand alone resource and a workshop module. Workshop participants should have already created accounts and sent us their email addresses, so we can transfer a completed project to them prior to the workshop. This completed project will get around having to upload a full-sized set of FASTQs during the workshop.
1. Sign in and create a project
1.1. Navigate to the 10x Genomics Cloud Analysis webpage, by going to 10x Genomics.com. Hover over the "Products" tab (A in the Figure below). Click on "Cloud Analysis" (B).
1.2. Click the button to move to the sign up /sign in page. Here is a direct link to the sign in page: https://cloud.10xgenomics.com/signin.
1.3. After you have signed in, the first thing you are going to see is the 10x Cloud projects page. Here you can see that I have one project that already exists (A in the Figure below). If you are participating in this workshop live you will have an existing project as well. We will come back to this later. Now let’s create a new project by clicking on "Create New Project." There are two places you can click to create a new project (B).
1.4. We can now name and describe our new project. Try to name your projects and provide descriptions that will make sense to you, and others, months from now. A good description can help not only you, but others that may be working on your project. The description is optional, but recommended.
1.5. After you click "Create Project" you will be taken to your project. On this page you can see the name of your project, the description, and the Project ID assigned to your project by the system. From here you are set up to upload your FASTQ files.
2. Upload your input data
2.1. After you create a project it will take you right to the next step. To upload your input FASTQs, you can use the 10x Cloud web uploader or the command line uploader. For today, let’s use the web uploader with a set of tiny FASTQ files. If you are participating in this workshop live, you were sent the Tiny FASTQ set before the workshop. If you are following this workshop on your own (or missed the email) you can download the tiny FASTQs here. If I open the directory that contains the FASTQ files in finder this is what I see.
2.2. We have two sets of reads. Each set has three files. The R1 and R2 files contain the paired sequencing reads. The I1 files have the sequencing indices and are optional in our three prime gene expression assay. I’m showing them here because most sequencing providers deliver the I files along with R1 and R2 files. These files need to follow specific naming convention:
2.3. Click on "Browse Files" to use the web uploader to upload our tiny FASTQ files.
2.4. This will open your computer's file browser. Navigate to the directory that contains your FASTQ files. Drag it to your project window (A in the Figure below). This queues the FASTQ files for upload. Click "Cancel" on the file browser to return to your project (B).
2.5. Click "Start Upload" to start the upload.
2.6. The upload manager opens, and we can monitor the progress of the upload from here. After a bit you will get an estimate for how long the data upload is going to take.
2.7. Once complete you will get an "Upload Successfully Completed" message. Then you can close the upload manager.
2.8. The FASTQs are now visible in your project. Based on the file naming, 10x Cloud Analysis will group sets of FASTQ files together. We can see that each set has an I1, R1, and R2 file associated with it. Now, you are ready to move on to analysis.
Note: If you would like a command line option for uploading FASTQ files, you can find detailed instructions here.
3. Create a new analysis
3.1. The Tiny FASTQ data set is great for learning to use the uploader, but it isn’t very useful for doing analysis. So let’s move over to the Neutrophils workshop project by clicking on it. Note: if you are working through this on your own, you can find the input FASTQ files for this data set here.
3.2. In this project we have two sets of FASTQ files as well. We also have an I2 file in these sets because the sequencing library and run was dual indexed.
3.3. Create a new analysis by A. Selecting your FASTQs and assigning a library type to each set. We are including both sets of FASTQ in our analysis and the library type is "Single Cell 3' Gene Expression." B. If we had just uploaded these reads we would have to set library type before we could create the new analysis. Now C. Enter the analysis wizard by clicking on the "Create New Analysis" button.
3.4. The analysis wizard shows you the FASTQs you are analyzing and selects a pipeline based on the library type you assigned to the FASTQ files. The Run Analysis button will remain inactive until we have filled in the minimum required parameters.
3.5. We can now add a name and a description to our analysis. Just like the project name and description, try to make these meaningful.
3.6. The only required field left is the Transcriptome reference. These are human PBMCs, so we will select the most recent human reference from the dropdown. At this point the "Run Analysis" button becomes active and we could run this analysis, but we are not quite ready for that yet.
3.7. Neutrophils typically express only a few hundred genes by default. To ensure that we identify these cells in our analysis we can: A. Turn on "Include Intronic Reads", and B. in advanced settings, set "Force Cells" to 8000. C. Now we are ready to click Run Analysis.
3.8. We are now presented with a screen that lets us know our analysis has started and that we will be notified by email once it is complete. Now we can click "Back to Project".
3.9. In our project we see our new analysis. A. The status is "In progress". B. We can see it has been running for 1 minute.
4. Output files
4.1. From here you can either wait for the run to complete or, if you are in the live workshop, you can have a look at the preloaded run that used the same data and parameters as the analysis we just started. Either way click on the analysis to see the results.
4.2. There is a lot of information on this page. A. By clicking the “View all settings” dropdown you can see all the parameters used in this run. B This button is for viewing the web summary. We will spend some time on the web summary a little later.
4.3. Scrolling down takes us to the output files where we can find a number of different file types. After we do a bit of QA with the web summary we will use the Loupe file that you can see here. A. While we are here, we might as well start downloading the Loupe file by checking the box next to the file and B. clicking the download button. C. We can also see on this page that we have a number of days left of free storage for this analysis. For more information 10x Genomics Cloud Analysis please visit our website here. If you have questions about 10X Genomics Cloud Analysis, have an issue with your run, or want to provide feedback D. these three links are helpful for finding documentation, creating a ticket with 10x Genomics support, and/or sending in your feedback.
This concludes the raw data processing section of our workshop. Up next we have Quality Assessment using the Cell Ranger Web summary.