
Hi, welcome to my website. This site is under construction; the content is maintained at zxzyl.com. Click the link to jump to that page. Thank you!

From March 2015 on, I will use this WordPress site as my main blog. I hope to post more articles.

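RNA sequencing can measure gene expression in humans and other organisms. The method has recently become very popular in biological and medical research and is gradually moving toward clinical use. Compared with earlier methods, its advantage is that it makes it easier to study the gene isoforms, or transcripts, produced by alternative splicing.

So how reliable is RNA sequencing? The Sequencing Quality Control (SEQC) project, led by the US FDA, recently carried out a comprehensive assessment of the accuracy, reproducibility, and information content of RNA sequencing, and published its primary results in Nature Biotechnology.

The team ran reference RNA samples on Illumina HiSeq, Life Technologies SOLiD, and Roche 454 platforms at multiple laboratories around the world. (BGI-Shenzhen, Fudan University, East China Normal University, and other institutions took part in the project.) The researchers mainly assessed RNA-seq performance in junction discovery and differential expression profiling, and compared it with microarrays and quantitative PCR (qPCR).

The researchers found unannotated exon-exon junctions at all sequencing depths, more than 80% of which were validated by qPCR. Measuring relative expression with RNA-seq gives accurate and reproducible results, but neither RNA-seq nor microarrays provide accurate absolute measurements, and every platform examined, qPCR included, shows gene-specific biases.

The data analysis algorithm also has a large effect on RNA-seq results: transcript-level data produced by different algorithms vary widely. According to the study, BitSeq, a method based on probabilistic modeling and developed at the University of Helsinki and the University of Manchester, produced the most reliable results.

The complete SEQC data set from this study contains more than 10 Tb of reads and is a valuable resource for evaluating RNA-seq analyses.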

Reposted from http://www.biodiscover.com/news/research/112652.html

Original paper: http://www.nature.com/nbt/journal/v32/n9/full/nbt.2957.html

We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the US Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used. In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed for all examined platforms, including qPCR. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings.

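So far I have run into two situations in which Java blocks.
The first is blocking in conn.getInputStream().read() after an HTTP stream has been opened. This usually happens when the network fails, or the remote server has a problem, after the stream is already open.

The usual way to write the read loop is: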

int n;
while ((n = is.read(buffer)) != -1) {
    // do something with the first n bytes of buffer
}

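The fix usually suggested online is to use a socket and check for a timeout. My own fix is
not to use is.read(buffer) as the loop condition, but to check is.available() instead. If available() keeps returning 0, exit the loop and reconnect. In effect, you detect the block yourself.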
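
The available()-based check described above can be sketched roughly as follows. This is only an illustration, not the author's actual code: the class name StallAwareReader, the method readExactly, and the parameters expectedLength, maxIdlePolls, and pollMillis are all hypothetical, and the sketch assumes the expected number of bytes is known in advance (for example from a Content-Length header). Note that available() is only an estimate, and some stream implementations return 0 even when a read would succeed, so the standard URLConnection.setConnectTimeout and setReadTimeout calls are usually the more robust option.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not taken from the original post.
final class StallAwareReader {

    /**
     * Reads exactly expectedLength bytes, polling available() before each read.
     * Throws IOException when available() returns 0 for maxIdlePolls polls in a
     * row, so that the caller can close the connection and reconnect.
     */
    static byte[] readExactly(InputStream is, int expectedLength,
                              int maxIdlePolls, long pollMillis)
            throws IOException, InterruptedException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int idlePolls = 0;
        while (out.size() < expectedLength) {
            int ready = is.available();
            if (ready == 0) {
                // Nothing to read right now: count the idle poll and give up
                // once the stream has looked idle for too long.
                if (++idlePolls >= maxIdlePolls) {
                    throw new IOException("stream stalled after " + out.size() + " bytes");
                }
                Thread.sleep(pollMillis);
                continue;
            }
            idlePolls = 0;
            // Never ask for more than is reported ready, so read() should not block.
            int want = Math.min(Math.min(ready, buffer.length), expectedLength - out.size());
            int n = is.read(buffer, 0, want);
            if (n == -1) {
                throw new IOException("unexpected end of stream after " + out.size() + " bytes");
            }
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }
}
```

A caller would wrap readExactly(conn.getInputStream(), length, 50, 100) in a retry loop and reopen the connection whenever the IOException is thrown; the values 50 and 100 are arbitrary placeholders.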

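The second is when running Runtime.getRuntime().exec(script): if the script writes a lot of error output and that output is not read promptly, the child process stalls once its output buffer fills up, and Java blocks waiting for it.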

Here is the solution I found online. Reposted from http://saluya.iteye.com/blog/1260347
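
One common way to avoid this kind of block, which is not necessarily the approach in the linked post, is to merge the child's error stream into its output stream and drain it fully before waiting for the exit code. The sketch below uses ProcessBuilder rather than Runtime.exec; the class name DrainedProcess and its run method are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper for illustration; not the code from the linked post.
final class DrainedProcess {

    final int exitCode;
    final String output;

    private DrainedProcess(int exitCode, String output) {
        this.exitCode = exitCode;
        this.output = output;
    }

    /** Runs the command, reading stdout and stderr together until the child closes them. */
    static DrainedProcess run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        // Merge stderr into stdout so that a single reader drains both pipes.
        pb.redirectErrorStream(true);
        Process process = pb.start();
        // We send no input, so close stdin and let the child see end-of-file.
        process.getOutputStream().close();

        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        try (InputStream in = process.getInputStream()) {
            byte[] buffer = new byte[4096];
            int n;
            // Keep reading so the child's output pipe can never fill up.
            while ((n = in.read(buffer)) != -1) {
                captured.write(buffer, 0, n);
            }
        }
        // Only wait for the exit code after the output has been fully drained.
        int exit = process.waitFor();
        return new DrainedProcess(exit, new String(captured.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

If stdout and stderr must stay separate, each stream needs its own reader thread instead; merging them is simply the shortest way to make sure nothing is left unread.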


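After I upgraded EndNote on the Mac from X6 to X7, the insert tools in Word were still the X6 ones, and inserting a reference produced the message "Error while reading serialized data".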

After upgrading the EndNote software, Word still shows the older version of the EndNote tools and displays an “Error while reading serialized data” message. This problem may happen if the older version of EndNote was not uninstalled properly using Customizer, so that the older version's CWYW tools are still loading in Word.
1 Make sure to close all Office programs and then open your hard drive.
2 Navigate to your Word startup folder. This path is usually [Applications: Microsoft Office 2011: Office: Startup: Word].
3 Take any files from this ‘Word’ folder and drag them out to the Desktop.
4 Open EndNote.
5 In EndNote, click on the EndNote menu and choose “Customizer…”
6 On the Customizer window, make sure “Cite While You Write” is checked.
7 Click the “Next” button twice and then press “Done” to close the Customizer window.
8 Open Word again to see if the latest EndNote tools are now loading.


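A girl in one of my WeChat groups said she is about to graduate. Last night she went clubbing with her best friends, then ended up crying and shouting in the street in the middle of the night, so today her voice is as hoarse as Yang Kun's.

My own graduation, by contrast, was calm; I did nothing crazy. Drinking with my buddies, I felt perfectly normal, as if graduating were just another routine procedure. Then, at the very end, packing my luggage alone and listening to Yellow, I suddenly started crying like an idiot. I have always been slow to feel things, and it was probably only then that I understood what I was saying goodbye to.

Goodbye.

We all wrote things like "friends forever" in each other's yearbooks (I don't know whether yearbooks are still a thing, or whether people now just swap Renren and Weibo accounts), yet somehow we lost touch anyway. The bustle Renren once had is gone too, replaced by silence.

It's not that I don't want to get in touch. I'm just afraid that when we do, all that is left is "Long time no see." "Doing fine lately." and then nothing more to say. Everyone is afraid of an old friendship turning into something so hollow, so we simply don't get in touch. And as each of us settles into our own track in life, when someone does come to mind, we are afraid of intruding.

The girl I used to get up at six just to see; the buddy I stayed up late smoking with downstairs, along with the meal he still owes me; the girl who kept me company for a long time after a breakup and then suddenly disappeared; the guys who hugged and cried at the farewell dinner.

After that, we really never saw each other again.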


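You can edit the text in a PDF: delete it, replace it, or add to it. Note that this applies to text in the PDF, not to images.

1. Select the region you want to edit

2. Choose Correct Text

Then you can edit it just as you would a txt file.

Main features:

Replace blocks of text in the original PDF

Move, resize, copy, and delete images in the original PDF

Overlay text and images onto a PDF (for example, stamping a signature image onto it)

Perform optical character recognition (OCR) on scanned documents

Copy and paste rich text, keeping fonts and formatting when copying from a PDF

Mark up text with highlighting, underlining, and strikethrough

Store frequently used images, signatures, objects, and text in the library

Create and edit tables of contents

Use AppleScript for PDF maintenance operations

Supports a variety of image formats, and can open and edit Word documents, including the rtf, doc, and docx formats.

Download: http://www.cr173.com/soft/35642.html http://bbs.weiphone.com/read-htm-tid-415620.html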

A scaffold is a portion of the genome sequence reconstructed from end-sequenced whole-genome shotgun clones. Scaffolds are composed of contigs and gaps. A contig is a contiguous length of genomic sequence in which the order of bases is known to a high confidence level. Gaps occur where reads from the two sequenced ends of at least one fragment overlap with other reads in two different contigs (as long as the arrangement is otherwise consistent with the contigs being adjacent). Since the lengths of the fragments are roughly known, the number of bases between contigs can be estimated.

The goal of whole-genome shotgun assembly is to represent each genomic sequence in one scaffold; however, this is not always possible. One chromosome may be represented by many scaffolds (e.g., Chlamydomonas reinhardtii) or just a single scaffold (e.g., Human chromosome 19), depending on how completely the genome can be reconstructed, or assembled, from the available reads. The relative locations of scaffolds in the genome are unknown.

Scaffolds are normally numbered approximately from largest to smallest. Some scaffolds may ultimately be filtered out of the assembly, resulting in skipped scaffold numbers. In some cases, scaffolds can overlap. For example, in polymorphic genomes, regions with a high density of allelic differences between haplotypes may be split into separate sets of scaffolds, each representing one allele. Thus, a sequence that exists in only one location in the genome may appear on more than one scaffold.

Gaps are shown in the Genome Viewer as red lines or rectangles in the scaffold track (viewed in “full” mode). Contigs are shown in black. In FASTA sequences, gaps are represented by a series of Ns.

In computational biology, the N50 statistic is a statistic of a set of contig lengths. The N50 is similar to a mean or median, but has greater weight given to the longer contigs. It is used widely in genome assembly, especially in reference to contig lengths within a draft assembly. Given a set of contigs, each with its own length, the N50 length is defined as the length for which the collection of all contigs of that length or longer contains at least half of the total of the lengths of the contigs, and for which the collection of all contigs of that length or shorter contains at least half of the total of the lengths of the contigs. (When more than one value of length meets both these criteria then the N50 is the average of the longest and shortest lengths that meet these criteria.) This can be thought of as the point of half of the mass of the distribution; the number of bases from all contigs shorter than the N50 will be close to equal to the number of bases from all contigs longer than the N50. The N90 statistic is smaller than or equal to the N50 statistic; it is the length for which the collection of all contigs of that length or longer contains at least 90% of the total of the lengths of the contigs, and for which the collection of all contigs of that length or shorter contains at least 10% of the total of the lengths of the contigs.
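
The N50 and N90 definitions above can be illustrated with a short sketch. It uses the common simplified convention (sort the contigs from longest to shortest and return the length at which the running total first reaches the target fraction of all bases) and does not apply the tie-averaging rule mentioned in the paragraph. The class name AssemblyStats and the method nx are hypothetical names chosen for this example.

```java
import java.util.Arrays;

// Hypothetical helper for illustration.
final class AssemblyStats {

    /**
     * Returns the Nx length: walking the contigs from longest to shortest, the
     * length of the contig at which the running total of bases first reaches
     * `percent` percent of the total. nx(lengths, 50) is the N50.
     */
    static long nx(long[] contigLengths, int percent) {
        if (contigLengths.length == 0) {
            throw new IllegalArgumentException("no contigs");
        }
        if (percent < 1 || percent > 100) {
            throw new IllegalArgumentException("percent must be between 1 and 100");
        }
        long[] sorted = contigLengths.clone();
        Arrays.sort(sorted); // ascending order; iterated backwards below

        long total = 0;
        for (long length : sorted) {
            total += length;
        }

        long cumulative = 0;
        for (int i = sorted.length - 1; i >= 0; i--) { // longest contig first
            cumulative += sorted[i];
            // Integer comparison avoids floating-point rounding at the threshold.
            if (cumulative * 100 >= total * percent) {
                return sorted[i];
            }
        }
        return sorted[0]; // not reached when all lengths are positive
    }
}
```

For contigs of lengths 80, 70, 50, 40, 30, and 20 (290 bases in total), nx(lengths, 50) returns 70, because 80 + 70 = 150 is the first running total to reach 145. nx(lengths, 90) returns 30, because the running total first reaches 261 at 270 = 80 + 70 + 50 + 40 + 30.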


Somehow, the only way to use LibSVM with Weka is from the bash command line.
I have used the second method below successfully, but either one should work.

Step 1: Get Weka. I assume the bleeding-edge version, 3.7.0. Unzip it and put it in the /Applications folder.

Step 2: Get LibSVM.
a. Iowa State site:
If you use Safari to download, it will be unzipped in the Downloads directory. The files you need are ~/Downloads/WLSVM/lib/wlsvm.jar and ~/Downloads/WLSVM/lib/libsvm.jar
b. Taiwan site (http://www.csie.ntu.edu.tw/~cjlin/libsvm/libsvm-2.89.zip):
If you use Safari to download, the file you need is ~/Downloads/libsvm-2.89/java/libsvm.jar

Step 3: Open Terminal and copy the jar files into Weka.app. This assumes you have permission to write into Weka.app.
a. Iowa State version:
$ cp ~/Downloads/WLSVM/lib/*.jar /Applications/Weka/weka-3-7-0.app/Contents/Resources/Java
b. Taiwan version:
$ cp ~/Downloads/libsvm-2.89/java/libsvm.jar /Applications/Weka/weka-3-7-0.app/Contents/Resources/Java

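Step 4: Set CLASSPATH
$ export CLASSPATH=$CLASSPATH:/Applications/Weka/weka-3-7-0.app/Contents/Resources/Java/

Step 5: Run Weka from Terminal! Run the command from the Java folder above, because weka.jar and libsvm.jar are given as relative paths.
$ java -classpath $CLASSPATH:weka.jar:libsvm.jar weka.gui.GUIChooser &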

Another method:
1. Copy libsvm.jar and wlsvm.jar into the app folder, as described in Step 3 above.

2. Edit the Info.plist file inside the Weka .app bundle in TextEdit.

After the ClassPath key and the opening array tag, you will see $JAVAROOT/weka.jar inside string tags. Copy that whole line and paste two copies directly below it. Edit one of the new copies to say $JAVAROOT/libsvm.jar and the other to say $JAVAROOT/wlsvm.jar.