sag-mg-recruit
sag-mg-recruit is a metagenomic read recruitment workflow developed by the Stepanauskas Group and used in Pachiadaki et al. 2017.
Extensive instructions are available on the package's GitHub page: sag-mg-recruit
Available on C1 and C2 via SCGC's anaconda module.
You'll also need to load the dependencies flash and bwa.
To load everything into your environment:
module use /mod/scgc/
module load anaconda
module load flash
module load bwa
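After loading the modules, it can help to confirm that the tools are actually on your PATH before starting a long run. The following is a minimal sketch; the helper name check_tools is hypothetical, and it assumes the flash and bwa modules expose executables named flash and bwa.

```shell
# Hypothetical helper: report whether each named tool is on PATH.
# Returns non-zero if any tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Executable names are assumed to match the module names.
check_tools sag-mg-recruit flash bwa || echo "Load the modules above and try again."
```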
For instructions on how to run the workflow, type: sag-mg-recruit --help
This should return something like:
Usage: sag-mg-recruit [OPTIONS] INPUT_MG_TABLE INPUT_SAG_TABLE

Options:
  --outdir TEXT          directory location to place output files
  --cores INTEGER        number of cores to run on [default: 8]
  --mmd FLOAT            for join step: mismatch density [default: 0.05]
  --mino INTEGER         for join step: minimum overlap [default: 35]
  --maxo INTEGER         for join step: maximum overlap [default: 150]
  --minlen INTEGER       for alignment and mg read count: minimum alignment
                         length to include; minimum read size to include
                         [default: 150]
  --pctid INTEGER        for alignment: minimum percent identity to keep
                         within overlapping region [default: 95]
  --overlap INTEGER      for alignment: percent read that must overlap with
                         reference sequence to keep [default: 0]
  --log TEXT             name of log file, else, log sent to standard out
  --concatenate BOOLEAN  include concatenated SAG in analysis [default: True]
  --checkm BOOLEAN       should checkm be run on the SAGs? [default: True]
  --keep_coverage        if you want to keep the genome coverage table (large)
  -h, --help             Show this message and exit.
Each run requires a table listing input metagenomes and a table listing input SAGs. Example input tables can be found here. Make sure you also specify a new directory for output files using the --outdir parameter.
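As an illustration, a run might be put together as sketched below. The table file names and the output directory name are placeholders rather than real files, and the build_cmd helper is hypothetical. The sketch prints the command first so it can be inspected, and only runs it if the output directory does not exist yet and sag-mg-recruit is on your PATH.

```shell
# Placeholder inputs: replace with your own tables and a new output directory.
mg_table="metagenomes.csv"
sag_table="sags.csv"
outdir="recruit_run1"
cores=8   # the workflow's default

# Hypothetical helper: print the command line for inspection.
# Arguments: metagenome table, SAG table, output directory, cores.
build_cmd() {
    printf 'sag-mg-recruit --outdir %s --cores %s %s %s\n' "$3" "$4" "$1" "$2"
}

cmd=$(build_cmd "$mg_table" "$sag_table" "$outdir" "$cores")
echo "$cmd"

if [ -e "$outdir" ]; then
    # The output directory should be new for each run.
    echo "Output directory $outdir already exists; choose a new one." >&2
elif command -v sag-mg-recruit >/dev/null 2>&1; then
    sag-mg-recruit --outdir "$outdir" --cores "$cores" "$mg_table" "$sag_table"
fi
```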
This workflow is not necessarily optimized for our current HPC environment because it was written before the scheduler was installed. It recruits metagenomic reads to SAGs one metagenome-SAG pair at a time. A reasonable starting point is 12 to 30 cores and a walltime between 24 hours and a week, depending on how many metagenomes and SAGs you want to compare and on the size of your input metagenomes.
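A batch run along these lines could be prepared by writing a small job script, as in the sketch below. The script name, input table names, and output names are placeholders, and 24 cores is just one value from the range suggested above. The scheduler directives for cores and walltime are left as a comment, since their syntax depends on the scheduler; whatever core count you request there should match the value passed to --cores.

```shell
# Placeholder value picked from the suggested 12 to 30 core range.
cores=24

# Write a job script (hypothetical name run_recruit.sh) to the current directory.
cat > run_recruit.sh <<EOF
#!/bin/bash
# Add your scheduler's directives here, requesting ${cores} cores
# and a walltime between 24 hours and a week.

module use /mod/scgc/
module load anaconda
module load flash
module load bwa

# Placeholder table and output names: replace with your own.
sag-mg-recruit --outdir recruit_run1 --cores ${cores} --log recruit_run1.log metagenomes.csv sags.csv
EOF

echo "Wrote run_recruit.sh; submit it with your scheduler's submit command."
```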
Note that this workflow was designed for recruiting metagenomic reads generated by Illumina sequencers.