
周耀旗 (Yaoqi Zhou): Tips for Writing Good English Scientific Papers

(2007-06-09 19:57:48)













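My first experience of writing an English scientific paper was translating my bachelor's thesis from USTC (科大) into English. By the time I received my Ph.D. from the State University of New York in 1990, I had published more than 20 papers in English. Yet my understanding of how to write a high-quality scientific paper was still rudimentary: I knew only that I should minimize grammatical errors. This was because, most of the time, I happily accepted the revisions made by my Ph.D. advisors, Dr. George Stell and Dr. Harold Friedman, without knowing why they made those changes and without asking. This continued until I went to North Carolina State University as a postdoc. My postdoctoral advisor, Dr. Carol Hall, suggested that I attend a two-day writing workshop at nearby Duke University. That workshop, run by Professor Gopen, was a revelation. For the first time, I learned that readers have expectations as they read, and that the most effective way to write a good scientific paper is to meet those expectations. The workshop helped me win my first postdoctoral fellowship, which gave me the opportunity to join Dr. Martin Karplus's group at Harvard University. During my five years at Harvard, under Professor Karplus's guidance, I came to realize that a good paper requires thorough self-examination, inside and out, in both depth and breadth. Now I am a professor with my own research group, and I often review manuscripts. I feel it is necessary for my Ph.D. students and postdocs to learn to write well. I do not consider myself a writing expert, and my own papers are still rejected for one reason or another. But I believe that by sharing my understanding of writing and the lessons I have learned, I may spare others some of the detours I took. Since I have not written in Chinese for many years, corrections are welcome. Please write to: yqzhou@iupui.edu. You are welcome to visit my website: http://sparks.informatics.iupui.edu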
























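1. Readers want to see familiar information at the beginning of a sentence. The sentence is the smallest functional unit of an article. The easiest sentence to understand is one that says only what the reader already knows. But that is impossible in a scientific paper, because only new things get published; in fact, scientific papers usually contain many new terms. So an easy-to-understand sentence should begin with information familiar to the reader (or just mentioned), end with new information, and make a smooth transition between the two. Every sentence in a good article should move smoothly from old to new in this way. The golden rule for writing a good sentence opening is to ask yourself: "Have I mentioned this concept before?" Most articles are hard to read because many new concepts are used before they have been introduced. For example: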


Samples for 2-dimensional projection of kinetic trajectories are shown in Figure 7. The coil states are loosely gathered while the native states can form a black cluster with extreme high density in 2-dimensional projection plane.


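Here the information cannot flow from the first sentence to the second. The reader does not know where "The coil states" came from. Readers will find the revised version below easier to understand.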


Kinetic trajectories are projected onto xx and yy variables in Figure 7. This figure shows two populated states. One corresponds to loosely gathered coil states while the other is the native state with a high density.


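In this new paragraph, the newly inserted second sentence lets every sentence start from old information and end with new information. The first and second sentences are linked by "Figure", and the second and third by "two states". The new information, "coil states", appears only at the end, in the third sentence. The whole paragraph is chained together, link by link, into a single unit. Here is another example: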


The accuracy of the model structures is given by TM-score. In case of a perfect match to experimental structure, TM-score would be 1.


In the second sentence, the old information "TM-score" is buried in the middle, interrupted by the new information "a perfect match to experimental structure". The suggested revision is:


The accuracy of the model structures is measured by TM-score, which is equal to 1 if there is a perfect match to the experimental structure.


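The biggest problem in scientific writing is reversing the order of old and new information. New and old information may be hard for the author to tell apart, because he is so familiar with all of it. To avoid this problem, whenever you start a new sentence, ask yourself whether these words have been mentioned before. Always put what has been mentioned first and what has not been mentioned last.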


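2. Readers want to see the action verb immediately after the subject. In a sentence that says who is doing what, the reader needs to find the verb in order to understand it. If the verb is far from the subject, reading is interrupted by the search for the verb, and an interrupted reader finds the sentence hard to understand. Here is an example: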


The smallest URFs (URFA6L), a 207-nucleotide (nt) reading frame overlapping out of phase the NH2-terminal portion of the adenosinetriphosphatase (ATPase) subunit 6 gene has been identified as the animal equivalent of the recently discovered yeast H+-ATPase subunit 8 gene.




The smallest of the URFs is URFA6L, a 207-nucleotide (nt) reading frame overlapping out of phase the NH2-terminal portion of the adenosinetriphosphatase (ATPase) subunit 6 gene; it has been identified as the animal equivalent of the recently discovered yeast H+-ATPase subunit 8 gene.




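3. Readers expect each sentence to have only one point of emphasis, and that point is usually at the end. Compare the two sentences below; we can sense that they emphasize different things.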


URFA6L has been identified as the animal equivalent of the recently discovered yeast H+-ATPase subunit 8 gene. Recently discovered yeast H+-ATPase subunit 8 gene has a corresponding animal equivalent gene URFA6L.




The enthalpy of hydrogen bond formation between the nucleoside bases 2'-deoxyguanosine (dG) and 2'-deoxycytidine (dC) has been determined by direct measurement.


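This sentence appears to emphasize "direct measurement", which is probably not what the original author intended. Inverting it makes the sentence more balanced.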


We have directly measured the enthalpy of hydrogen bond formation between the nucleoside bases 2'-deoxyguanosine (dG) and 2'-deoxycytidine (dC).








The enthalpy of hydrogen bond formation between the nucleoside bases 2'-deoxyguanosine (dG) and 2'-deoxycytidine (dC) has been determined by direct measurement. dG and dC were derivatized at the 5' and 3' hydroxyls with triisopropylsilyl groups to obtain solubility of the nucleosides in non-aqueous solvents and to prevent the ribose hydroxyls from forming hydrogen bonds. From isoperibolic titration measurements, the enthalpy of dC:dG base pair formation is -6.65 ± 0.32 kcal/mol.




We have directly measured the enthalpy of hydrogen bond formation between the nucleoside bases 2'-deoxyguanosine (dG) and 2'-deoxycytidine (dC). dG and dC were derivatized at the 5' and 3' hydroxyls with triisopropylsilyl groups; these groups serve both to solubilize the nucleosides in non-aqueous solvents and to prevent the ribose hydroxyls from forming hydrogen bonds. The enthalpy of dC:dG base pair formation is -6.65 ± 0.32 kcal/mol according to isoperibolic titration measurements.


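The first sentence states the topic of the whole paragraph. The first sentence of the original paragraph was inverted in order to 1) place the new information "dG" and "dC" at the end of the sentence and emphasize it, and 2) connect better with the sentence that follows. The second sentence of the original paragraph was split into two parts so that each part expresses only one point. The last sentence summarizes the whole paragraph. Here is another example: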


Large earthquakes along a given fault segment do not occur at random intervals because it takes time to accumulate the strain energy for the rupture. The rates at which tectonic plates move and accumulate strain at their boundaries are approximately uniform. Therefore, in first approximation, one may expect that large ruptures of the same fault segment will occur at approximately constant time intervals. If subsequent main shocks have different amounts of slip across the fault, then the recurrence time may vary, and the basic idea of periodic main shocks must be modified.


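In this example, the first two sentences together explain the rate of strain accumulation. However, the old information from the first sentence is not placed at the beginning of the second. By the third sentence, readers usually no longer know what the paragraph is about. A clearer version would be: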


Large earthquakes along a given fault segment do not occur at random intervals because it takes time to accumulate the strain energy for the rupture. The rates of strain accumulation at the boundaries of tectonic plates are approximately uniform. Therefore, nearly constant time intervals (at first approximation) would be expected between large ruptures of the same fault segment. [However?], the recurrence time may vary; the basic idea of periodic main shocks may need to be modified if subsequent main shocks have different amounts of slip across the fault.


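The new paragraph now focuses on how often earthquakes occur. The underlines mark old information that was described earlier. Clearly, the link between old and new information is the key to understanding this paragraph. The flow from old information to new is the best way to make reading easy. The purpose of writing a paper is not to test the reader's reading ability but to test the author's ability to express ideas. Do not blame others for not understanding; blame yourself for not writing clearly. One often hears the complaint, "The reviewer doesn't even understand this!" The reviewer could just as well say, "The author can't even write this clearly."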














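1). Present only one central thesis. A paper with too many points is hard to write and prone to problems, and it makes it hard for readers to remember what you are trying to say.


2). Building on this central thesis, use an attractive (but never exaggerated) title to catch the reviewer's interest. Reviewers review only the papers that interest them. If you cannot arouse a reviewer's interest, it is better not to publish the paper. Editors sometimes get frustrated because they cannot find an interested reviewer. Only in science do people review for free.


3). Justify every parameter and every step. Reviewers have no time to work out the details. Justifying your procedures and parameters shows that you know what you are doing rather than tuning numbers to fit the data. Even if you are fitting to the data, justify the fitting process.


4). Ask yourself whether you have provided all the details needed to reproduce your work. The easier it is for a reviewer (or reader) to reproduce your work, the more likely he is to accept your paper. Of course, reviewers will not actually redo your work, but your description must convince them that it could be redone.


5). Be convincing! Do thorough work, not half-finished work! Use tests from many angles to prove your central thesis. Make the paper like a lawyer arguing for an acquittal: answer every possible question in advance.


6). Cite all the important work, especially the classics. Do a full literature search again when you write. To achieve these goals, a scientific paper must follow a certain framework.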










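If the paper is about a new method, technique, or algorithm, describe what is new about it in great detail, and describe it in a logical, reasonable way. This helps readers grasp the essentials of the new method. If the method uses parameters, justify each parameter (or its value): it was used before, it can be derived from physics or mathematics, or it has been extensively tested and optimized. If you cannot guarantee that a parameter is reasonable, you must describe what effect changing it would have (the actual results belong in the Results or Discussion section; the Methods section contains only a description of the effect). If you have not tested whether the parameters are reasonable, you should explain why (too expensive? too time-consuming? or deferred to future work?). The effect of changing the parameters is a measure of whether the method is robust: with a robust method, the results do not change much even when the parameters change a great deal.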
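
The robustness check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`sensitivity`, `is_robust`), the toy scoring function, the cutoff values, and the 10% tolerance are all hypothetical and are not taken from any particular paper. The idea is simply to rerun the method with the parameter moved far from its chosen value and report how much the result changes.

```python
from typing import Callable, Dict, Sequence


def sensitivity(method: Callable[[float], float],
                baseline: float,
                trial_values: Sequence[float]) -> Dict[float, float]:
    """Relative change in the result for each trial value of the parameter."""
    reference = method(baseline)
    if reference == 0:
        raise ValueError("baseline result is zero; relative change is undefined")
    return {value: (method(value) - reference) / abs(reference)
            for value in trial_values}


def is_robust(changes: Dict[float, float], tolerance: float) -> bool:
    """Call the method robust if no trial moves the result by more than `tolerance`."""
    return all(abs(change) <= tolerance for change in changes.values())


# Hypothetical toy "method": a score that depends only weakly on a cutoff parameter.
def toy_score(cutoff: float) -> float:
    return 0.80 + 0.01 * (cutoff - 8.0)


# Vary the cutoff by up to 50% around the (assumed) chosen value of 8.0.
changes = sensitivity(toy_score, baseline=8.0, trial_values=[4.0, 6.0, 10.0, 12.0])
for value, change in changes.items():
    print(f"cutoff={value}: relative change {change:+.3f}")
print("robust at 10% tolerance:", is_robust(changes, tolerance=0.10))
```

A table of such relative changes, reported in the Results or Discussion section, is one way to show a reviewer that a parameter was not tuned to produce a particular outcome.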
















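Once you have a central thesis, it is time to decide on the title. A title can advertise your method, your results, or the implications of your results. A title is usually only one sentence, so put the most important and most attractive information in it. For example, the title "Steric restrictions in protein folding: an alpha-helix cannot be followed by a contiguous beta-strand" mainly highlights the result. The title "Interpreting the folding kinetics of helical proteins", on the other hand, highlights the implications of the result. The title "Native proteins are surface-molten solids: Application of the Lindemann criterion for the solid versus liquid state" highlights both the method and the implications of the result. Note that "Native proteins are surface-molten solids" is an interpretation of the result, not the result itself. Use a title that is both broad and specific, so that it attracts more readers.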








Assessing secondary structure assignments of protein structures by using pairwise sequence-alignment benchmarks


The secondary structure of a protein refers to the local conformation of its polypeptide backbone. Knowing secondary structures of proteins is essential for their structure classification1,2, understanding folding dynamics and mechanisms3-5, and discovering conserved structural/functional motifs6,7. Secondary structure information is also useful for sequence and multiple sequence alignment8,9, structure alignment10,11, and sequence to structure alignment (or threading)12-15. As a result, predicting secondary structures from protein sequences continues to be an active field of research16-18 fifty six years after Pauling and Corey19-20 first predicted that the most common regular patterns of protein backbones are the α-helix and the β-sheet. Prediction and application of protein secondary structures rely on prior assignment of the secondary-structure elements from a given protein structure by human or computational methods.


Many computational methods have been developed to automate the assignment of secondary structures. Examples are DSSP, STRIDE, DEFINE, P-SEA, KAKSI, P-CURVE, XTLSSTR, SECSTR, SEGNO, and VoTAP. These methods are based on either the hydrogen-bond pattern, geometric features, expert knowledge or their combinations. However, they often disagree on their assignments. For example, disagreement among DSSP, P-CURVE, and DEFINE can be as large as 25%. More beta sheet is assigned by XTLSSTR and more pi-helix by SECSTR than by DSSP. The discrepancy among different methods is caused by non-ideal configurations of helices and sheets. As a result, defining the boundaries between helix, sheet, and coil is problematical and a significant source of discrepancies between different methods.


Inconsistent assignment of secondary structures by different methods highlights the need for a criterion or a benchmark of “standard” assignments that could be used to assess and compare assignment methods. One possibility is to use the secondary structures assigned by the authors who solved the protein structures. STRIDE, in fact, has been optimized to achieve the highest agreement with the authors’ annotations. However, it is not clear what is the criterion used for manual or automatic assignment of secondary structures by different authors. Another possibility is to treat the consensus prediction by several methods as the gold standard. However, there is no obvious reason why each method should weight equally in assigning secondary structures and which method should be used in consensus. Other used criteria include helix-capping propensity, the deviation from ideal helical and sheet configurations, and structural accuracy produced by sequence-to-structure alignment guided by secondary structure assignment.


In this paper, we propose to use sequence-alignment benchmarks for assessing secondary structure assignments. These benchmarks are produced by 3D-structure alignment of structurally homologous proteins. Instead of assessing the accuracy of secondary-structure assignment directly, which is not yet feasible, we compare the two assignments of secondary structures in structurally aligned positions. We assume that the best method should assign the same secondary-structure element to the highest fraction of structurally aligned positions. Certainly, structurally aligned positions do not always have the same secondary structures. Moreover, different structure-alignment methods do not always produce the same result. Nevertheless, this criterion provides a means to locate a secondary-structure assignment method that is most consistent with tertiary structure alignment. We suggest that this approach provides an objective evaluation of secondary structure assignment methods.








One question about the complex homopolymer phase diagram presented here is whether it is caused by the discontinuous feature of the square-well potential. We cannot give a direct answer because the DMD simulation is required to obtain well-converged results for the thermodynamics. However, the critical phenomena predicted for a fluid composed of particles interacting with a square-well potential are as realistic as those predicted for a fluid composed of particles interacting with a LJ potential. Also an analogous complex phase diagram is found in simulations of LJ clusters. The present results for square-well homopolymers may well be found in more realistic homopolymer models and even in real polymers.








How to make an objective assignment of secondary structures based on a protein structure is an unsolved problem. Defining the boundaries between helix, sheet, and coil structures is arbitrary, and commonly accepted standard assignments do not exist. Here, we propose a criterion that assesses secondary-structure assignment based on the similarity of the secondary structures assigned to structurally aligned residues in sequence-alignment benchmarks. This criterion is used to rank six secondary-structure assignment methods: STRIDE, DSSP, SECSTR, KAKSI, P-SEA, and SEGNO with three established sequence-alignment benchmarks (PREFAB, SABmark and SALIGN). STRIDE and KAKSI achieve comparable success rates in assigning the same secondary structure elements to structurally aligned residues in the three benchmarks. Their success rates are between 1-4% higher than those of the other four methods. The consensus of STRIDE, KAKSI, SECSTR, and P-SEA, called SKSP, improves assignments over the best single method in each benchmark by an additional 1%. These results support the usefulness of the sequence alignment benchmarks as the benchmarks for secondary structure assignment.






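1. Take writing seriously. Put as much time and effort into it as you can. It is an important part of scientific research. A paper that is poorly written is not read and not used, which is as good as not published.


2. Do not publish unless the study is thorough and complete and you have tried every approach that could support your conclusions.


3. Rethink and justify: Why did you do this work? What did you do? What is the most important finding? Why this method? Why these parameters? What has been done before (update your literature search)? What is different?


4. Look at your work critically. Only then can you find its weaknesses and improve it. Many of my papers were substantially revised through repeated discussion, and many calculations often had to be redone. A paper becomes more meaningful only when the results have been sorted out and understood.


5. Be able to answer every reasonable objection. If you have doubts yourself, resolve them; otherwise, why would anyone else believe you?


6. Do not hide any facts, do not fabricate, and do not underestimate the intelligence of other scientists. Make your research reproducible. Put all materials and data online.


7. From beginning (the title) to end (the conclusion or discussion), move from old information to new information. Never introduce new information at the beginning of a sentence. Never use a term before it has been defined.


8. Begin each paragraph with a sentence that states its topic, and end it with a transition to the next paragraph. Be coherent from title to conclusion. Link sentence to sentence and paragraph to paragraph, so that the paper is a whole rather than a disorderly pile of sentences. Only then will readers enjoy reading your paper.


9. Write, rewrite, and rewrite again. No one writes well the first time. Without time and effort, you cannot write well. My papers usually go through more than ten revisions.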




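Some of the examples in this article are taken from "The Science of Scientific Writing" by G. D. Gopen and J. A. Swan, American Scientist, 78, 550-558, 1990. I benefited greatly from Professor Gopen's 1995 annual short course at Duke University. I especially thank my advisors Martin Karplus (Harvard University), George Stell (State University of New York at Stony Brook), Harold L. Friedman (State University of New York at Stony Brook), and Carol Hall (North Carolina State University) for their encouragement and guidance. Without them, I would not have had so many opportunities to practice writing in English. Finally, I thank my students and postdocs. Their contributions to science allow me to keep writing papers, grant proposals, and reviews. Some of the examples in this article come from papers written with them. The first draft of this article was written in English. Because I type Chinese too slowly, I especially thank 徐貝思 (Xu Beisi) for translating it into a first Chinese draft. Any errors are my own; comments are welcome.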


