Literature Translation: Variational Bayesian Independent Component Analysis


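(Undergraduate graduation project thesis)

Graduation Project (Thesis): Translation of Foreign-Language Material

Author:

Major:

Student ID:

Class:

Supervisor:

June 2014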

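Variational Bayesian Independent Component Analysis

Abstract

Blind separation of signals through the info-max algorithm can be viewed as maximum likelihood learning in a latent variable model. In this paper we present an alternative to maximum likelihood learning in these models, namely Bayesian inference. It has already been shown that Bayesian inference can be applied to determine the latent dimensionality in principal component analysis models. In this paper we derive a similar approach for removing unnecessary source dimensions in an independent component analysis model. We present results on a toy data set and on some artificially mixed images.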

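1. Introduction

The aim of independent component analysis (ICA) is to find a representation of a data set that is based on probabilistically independent components. One way to obtain such a representation is to fit the data to a latent variable model in which the latent variables are constrained to be independent. We consider a model $\mathcal{M}$ with I latent dimensions and P observed dimensions, and a data set containing N samples. In the ICA literature the latent dimensions are usually called "sources". Because we seek a representation in which the latent variables x are generated independently, for any given data point n we take

$$p(\mathbf{x}_n) = \prod_{i=1}^{I} p(x_{in}) \tag{1}$$

Assuming Gaussian noise, the probability of each instantiation of the observed variables is taken to be

$$p(\mathbf{t}_n \mid \mathbf{x}_n, \mathbf{W}, \beta, \boldsymbol{\mu}) = \left(\frac{\beta}{2\pi}\right)^{P/2} \exp\left(-\frac{\beta}{2}\,\lVert \mathbf{t}_n - \mathbf{W}\mathbf{x}_n - \boldsymbol{\mu} \rVert^{2}\right) \tag{2}$$

where W is a P × I matrix of parameters, β is the inverse noise variance, and μ is a vector of means.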
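As a minimal sketch of the Gaussian noise model above, the log of the observation density can be evaluated directly from its definition. The function name `log_observation_density`, the list-of-rows layout for W, and the toy values are assumptions made for illustration; they do not come from the paper.

```python
import math


def log_observation_density(t, x, W, beta, mu):
    """Return ln p(t_n | x_n, W, beta, mu) under isotropic Gaussian noise.

    t: observed vector (length P); x: source vector (length I);
    W: P x I mixing matrix stored as a list of rows;
    beta: inverse noise variance; mu: mean vector (length P).
    """
    P = len(t)
    squared_norm = 0.0
    for p in range(P):
        prediction = sum(W[p][i] * x[i] for i in range(len(x))) + mu[p]
        squared_norm += (t[p] - prediction) ** 2
    return 0.5 * P * math.log(beta / (2.0 * math.pi)) - 0.5 * beta * squared_norm


# Hypothetical toy values: P = 2 observed dimensions, I = 1 source.
W = [[1.0], [2.0]]
mu = [0.5, -0.5]
x = [3.0]
t_exact = [3.5, 5.5]  # equals W x + mu, so the residual is zero
log_p = log_observation_density(t_exact, x, W, beta=2.0 * math.pi, mu=mu)
```

With β = 2π the normalising term vanishes, so `log_p` depends only on the squared residual, which makes the toy case easy to check by hand.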

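1.1 The source distribution

It is well known in independent component analysis that the choice of latent distribution matters. In particular, it must be non-Gaussian. Non-Gaussian source distributions fall into two classes: those with positive kurtosis, or "heavy tails", and those with negative kurtosis, or "light tails". The former are called super-Gaussian and the latter sub-Gaussian. If the true source distribution belongs to either class, we can attempt separation. For our ICA model we follow Attias (1998) and choose a flexible source distribution that can take either a super-Gaussian or a sub-Gaussian form, so that the resulting model applies in both cases. Attias chose a factorised distribution in which each factor is a mixture of M Gaussians,

$$p(\mathbf{x}_n) = \prod_{i=1}^{I}\left[\sum_{m=1}^{M} \pi_m\, \mathcal{N}\!\left(x_{in} \mid m_m, \sigma_m^{2}\right)\right] \tag{3}$$

where {π_m} are the mixing coefficients and each component is governed by a mean m_m and a variance σ_m².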
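The claim that a mixture of Gaussians can be either super-Gaussian or sub-Gaussian can be illustrated with a short sketch. It evaluates a one-dimensional mixture density and computes the excess kurtosis from the closed-form central moments of the mixture. The function names and the three parameter settings are hypothetical choices for illustration only.

```python
import math


def mixture_density(x, weights, means, variances):
    """Density of a one-dimensional mixture of Gaussians at x."""
    total = 0.0
    for pi_m, m_m, var_m in zip(weights, means, variances):
        norm = 1.0 / math.sqrt(2.0 * math.pi * var_m)
        total += pi_m * norm * math.exp(-0.5 * (x - m_m) ** 2 / var_m)
    return total


def excess_kurtosis(weights, means, variances):
    """Excess kurtosis of the mixture, from closed-form central moments."""
    mean = sum(w * m for w, m in zip(weights, means))
    second = 0.0
    fourth = 0.0
    for w, m, v in zip(weights, means, variances):
        d = m - mean  # offset of this component from the mixture mean
        second += w * (v + d ** 2)
        fourth += w * (3.0 * v ** 2 + 6.0 * v * d ** 2 + d ** 4)
    return fourth / second ** 2 - 3.0


# Same mean, different variances: heavy tails (positive kurtosis).
heavy_tailed = excess_kurtosis([0.5, 0.5], [0.0, 0.0], [1.0, 9.0])
# Well-separated means, equal variances: light tails (negative kurtosis).
light_tailed = excess_kurtosis([0.5, 0.5], [-2.0, 2.0], [1.0, 1.0])
# A single Gaussian has zero excess kurtosis.
single_gaussian = excess_kurtosis([1.0], [0.0], [1.0])
```

Two components with a shared mean and unequal variances give a peaked, heavy-tailed density, while two well-separated components of equal variance give a flat, bimodal one, so the same family covers both classes.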

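Attias referred to this as the independent factor analysis model. We can now write down a likelihood that is a function of the parameters W, β and μ,

$$p(\mathbf{t} \mid \mathbf{W}, \beta, \boldsymbol{\mu}) = \prod_{n=1}^{N} \int p(\mathbf{t}_n \mid \mathbf{x}_n, \mathbf{W}, \beta, \boldsymbol{\mu})\, p(\mathbf{x}_n)\, d\mathbf{x}_n \tag{4}$$

This function can now be maximised with respect to the parameters to determine the independent components. Traditionally this optimisation is performed in the limit as β tends to zero. This approach was introduced by Bell and Sejnowski for blind source separation as an information maximisation algorithm. The relationship with maximum likelihood was pointed out by several authors, including Cardoso (1997) and MacKay (1996).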
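For a single source and a single observed dimension (I = P = 1), the integral over the source in the likelihood above has a closed form when the source is a mixture of Gaussians: each component is convolved with the Gaussian noise, which gives another mixture of Gaussians over t. The sketch below compares that closed form with a trapezoidal approximation of the integral. All function names and parameter values are hypothetical, and the scalar setting is a simplification of the general model.

```python
import math


def gaussian_pdf(value, mean, variance):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (value - mean) ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)


def marginal_closed_form(t, w, mu, beta, weights, means, variances):
    """p(t | w, beta, mu) for I = P = 1: again a mixture of Gaussians."""
    return sum(
        pi_m * gaussian_pdf(t, w * m_m + mu, w ** 2 * var_m + 1.0 / beta)
        for pi_m, m_m, var_m in zip(weights, means, variances)
    )


def marginal_quadrature(t, w, mu, beta, weights, means, variances,
                        lower=-20.0, upper=20.0, steps=8000):
    """Trapezoidal approximation of the integral over the source x."""
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lower + k * h
        prior = sum(pi_m * gaussian_pdf(x, m_m, var_m)
                    for pi_m, m_m, var_m in zip(weights, means, variances))
        integrand = gaussian_pdf(t, w * x + mu, 1.0 / beta) * prior
        total += integrand * (0.5 if k in (0, steps) else 1.0)
    return total * h


# Hypothetical parameter values for a two-component source.
params = dict(w=2.0, mu=0.5, beta=4.0,
              weights=[0.3, 0.7], means=[-1.0, 1.0], variances=[0.5, 2.0])
exact = marginal_closed_form(1.0, **params)
approx = marginal_quadrature(1.0, **params)
```

With several sources the same convolution argument applies, but the number of mixture components grows as M to the power I, which is one reason the sketch is restricted to the scalar case.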

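2. A Bayesian formalism for ICA

In this paper we propose to follow the Bayesian approach of inferring the parameterisation of the model, rather than learning the parameters by maximum likelihood. This requires us to place priors over the model parameters. Our aim is to show how, through a particular choice of the prior distribution p(W), we can automatically determine the number of sources that generated the data. Our approach is inspired by the Bayesian PCA of Bishop (1999), which aims to determine the dimensionality of the principal subspace automatically. We place a gamma prior on the noise precision β,

$$p(\beta) = \mathrm{gam}(\beta \mid a_\beta, b_\beta) \tag{5}$$

where we define the gamma distribution as

$$\mathrm{gam}(\tau \mid a, b) = \frac{b^{a}}{\Gamma(a)}\, \tau^{a-1} \exp(-b\tau) \tag{6}$$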
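The gamma density defined above uses a shape parameter a and a rate parameter b. A minimal sketch of its evaluation, working in the log domain for numerical stability, is given below. The function names are assumptions for illustration.

```python
import math


def log_gamma_density(tau, a, b):
    """ln gam(tau | a, b) with shape a and rate b, for tau > 0."""
    return a * math.log(b) - math.lgamma(a) + (a - 1.0) * math.log(tau) - b * tau


def gamma_density(tau, a, b):
    """gam(tau | a, b), obtained by exponentiating the log density."""
    return math.exp(log_gamma_density(tau, a, b))


# With a = 1 the density reduces to the exponential distribution b * exp(-b * tau).
exponential_case = gamma_density(0.5, a=1.0, b=2.0)
```

Under this parameterisation the mean is a / b and the variance is a / b², so small values of both a and b give a broad prior.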

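For the mixing matrix W we consider a Gaussian prior. In particular, the relevance of each input can be determined through the use of an automatic relevance determination (ARD) prior (Neal, 1996; MacKay, 1995),

$$p(\mathbf{W} \mid \boldsymbol{\alpha}) = \prod_{i=1}^{I} \prod_{p=1}^{P} \mathcal{N}\!\left(w_{ip} \mid 0, \alpha_i^{-1}\right) \tag{7}$$

where the prior is governed by a vector of hyperparameters α of length I. The parameters associated with each input of the network are governed by one element of this vector, which determines that input's "relevance". The hyperparameters α can in turn be inferred within a hierarchical Bayesian framework. We therefore place a hyperprior over these parameters in the form of a product of gamma distributions,

$$p(\boldsymbol{\alpha}) = \prod_{i=1}^{I} \mathrm{gam}(\alpha_i \mid a_\alpha, b_\alpha) \tag{8}$$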
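The effect of the ARD prior can be seen by sampling from it: a large precision α_i forces every weight attached to source i towards zero, which effectively switches that source off. The sketch below draws a mixing matrix for fixed, hypothetical values of α; in the model itself α is inferred, not fixed, and the function names are assumptions for illustration.

```python
import random


def sample_ard_weights(alpha, P, rng):
    """Draw a P x I matrix with entry [p][i] sampled from N(0, 1 / alpha[i])."""
    # random.Random.gauss takes a standard deviation, hence alpha_i ** -0.5.
    return [[rng.gauss(0.0, alpha_i ** -0.5) for alpha_i in alpha] for _ in range(P)]


def column_mean_square(W, i):
    """Average squared weight attached to source i."""
    return sum(row[i] ** 2 for row in W) / len(W)


rng = random.Random(0)   # fixed seed so the sketch is repeatable
alpha = [1.0, 1.0e6]     # hypothetical: source 0 relevant, source 1 switched off
W = sample_ard_weights(alpha, P=200, rng=rng)
relevant_scale = column_mean_square(W, 0)
pruned_scale = column_mean_square(W, 1)
```

The column for the high-precision source has a mean square close to 1 / α_i, many orders of magnitude below the relevant column.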

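Finally, we place a Gaussian prior over the means μ,

$$p(\boldsymbol{\mu}) = \prod_{p=1}^{P} \mathcal{N}\!\left(\mu_p \mid 0, \tau^{-1}\right) \tag{9}$$

where τ is the inverse variance of the prior. We can now define our model likelihood,

$$p(\mathbf{t} \mid \mathcal{M}) = \int p(\mathbf{t} \mid \mathbf{x}, \mathbf{W}, \beta, \boldsymbol{\mu})\, p(\mathbf{x})\, p(\mathbf{W} \mid \boldsymbol{\alpha})\, p(\boldsymbol{\alpha})\, p(\beta)\, p(\boldsymbol{\mu})\; d\mathbf{x}\, d\mathbf{W}\, d\boldsymbol{\alpha}\, d\beta\, d\boldsymbol{\mu} \tag{10}$$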

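3. Variational approach

In Bayesian inference our aim is to determine the posterior distribution of the parameters. Integrals of the type shown in equation (10) are central to this process. Unfortunately, for Bayesian ICA as we have described it this integral is intractable, and we must turn to approximations to make progress. We choose to take a variational approach (Jordan et al., 1998; Lawrence, 2000). The variational approach involves developing an approximation q(H) to the true posterior distribution p(H | V) of the latent variables H given the observed variables V. Variational inference can provide a rigorous lower bound on the marginalised log-likelihood of the form

$$\ln \int p(H, V)\, dH \ge \int q(H) \ln \frac{p(H, V)}{q(H)}\, dH \tag{11}$$

The difference between this bound and the true marginal log-likelihood can be shown to be the Kullback-Leibler (KL) divergence between the approximation and the true posterior distribution,

$$\mathrm{KL}(q \,\|\, p) = -\int q(H) \ln \frac{p(H \mid V)}{q(H)}\, dH \tag{12}$$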
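The relationship between the lower bound and the KL divergence can be checked numerically on a toy model with a discrete latent variable, where the integral becomes a sum. The joint values and the approximating distribution below are hypothetical numbers chosen only to illustrate the identity; the function names are likewise assumptions.

```python
import math


def log_evidence(joint):
    """ln p(V) = ln sum_H p(H, V) for a discrete latent variable H."""
    return math.log(sum(joint))


def lower_bound(joint, q):
    """sum_H q(H) ln [p(H, V) / q(H)], skipping states with q(H) = 0."""
    return sum(q_h * math.log(p_h / q_h) for p_h, q_h in zip(joint, q) if q_h > 0.0)


def kl_to_posterior(joint, q):
    """KL(q || p(H | V)) for the same discrete model."""
    evidence = sum(joint)
    return sum(q_h * math.log(q_h / (p_h / evidence))
               for p_h, q_h in zip(joint, q) if q_h > 0.0)


joint = [0.10, 0.25, 0.05]   # hypothetical values of p(H = h, V) at the observed V
q = [0.2, 0.5, 0.3]          # an arbitrary approximating distribution
gap = log_evidence(joint) - lower_bound(joint, q)
posterior = [p_h / sum(joint) for p_h in joint]
```

Here `gap` equals `kl_to_posterior(joint, q)`, and the gap closes when `q` is set to `posterior`, which is the sense in which an unrestricted approximation makes the bound exact.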

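If we used an unrestricted approximation q(H) and performed a free-form maximisation of the bound (11) with respect to q, we would recover q(H) = p(H | V) and the bound would become exact. This is the expectation step of an expectation-maximisation algorithm. For our model, however, such a choice does not lead to a tractable solution. Instead, by placing restrictions on the form of the approximating distribution, we hope to achieve tractability while keeping the KL divergence as small as possible. The choice of the variational distribution q is important: we seek a choice that is simple enough to keep our calculations tractable, yet flexible enough to make the bound (11) tight. There are various ways to determine a useful approximation. Lappalainen (1999), for example, imposes a specific parameterised functional form on his variational distributions and then minimises the KL divergence by gradient-based optimisation of their parameters.

In this paper we prefer to consider free-form optimisation of our approximating distributions. As already mentioned, free-form optimisation of a completely unconstrained posterior approximation would merely recover the true posterior distribution, which is intractable. We must therefore impose constraints on the form of the approximation. Consider a model in which the latent variables H are divided into exclusive subsets H_i. Suppose we impose a separability constraint across these subsets on our posterior approximation,

$$q(H) = \prod_{i} q_i(H_i) \tag{13}$$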

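It is straightforward to show that the optimal form of each individual component of the posterior approximation is

$$q_j(H_j) \propto \exp \left\langle \ln p(H, V) \right\rangle_{\prod_{i \ne j} q_i} \tag{14}$$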
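The optimal-factor result above can be turned into a simple coordinate-wise scheme: update one factor while the others are held fixed, then cycle. The sketch below applies it to a toy model with two binary latent variables, where each expectation is a finite sum. The joint table is a set of hypothetical numbers and the function names are assumptions; the paper's own updates are for the continuous ICA model, not for this toy case.

```python
import math


def normalise(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def mean_field_update(log_joint, q_other, axis):
    """Set one factor proportional to exp(E_{q_other}[ln p(H, V)])."""
    n_rows, n_cols = len(log_joint), len(log_joint[0])
    if axis == 0:   # update q_1, averaging over h_2
        expect = [sum(q_other[b] * log_joint[a][b] for b in range(n_cols))
                  for a in range(n_rows)]
    else:           # update q_2, averaging over h_1
        expect = [sum(q_other[a] * log_joint[a][b] for a in range(n_rows))
                  for b in range(n_cols)]
    shift = max(expect)  # subtract the maximum before exponentiating
    return normalise([math.exp(e - shift) for e in expect])


def lower_bound(log_joint, q1, q2):
    """Variational lower bound for the factorised approximation q1 * q2."""
    total = 0.0
    for a, qa in enumerate(q1):
        for b, qb in enumerate(q2):
            weight = qa * qb
            if weight > 0.0:
                total += weight * (log_joint[a][b] - math.log(weight))
    return total


# Hypothetical table of p(h1, h2, V) at the observed V; it sums to p(V) = 0.6.
joint = [[0.30, 0.05], [0.10, 0.15]]
log_joint = [[math.log(v) for v in row] for row in joint]
q1, q2 = [0.5, 0.5], [0.5, 0.5]
bounds = []
for _ in range(20):
    q1 = mean_field_update(log_joint, q2, axis=0)
    q2 = mean_field_update(log_joint, q1, axis=1)
    bounds.append(lower_bound(log_joint, q1, q2))
log_evidence = math.log(sum(sum(row) for row in joint))
```

Each update maximises the bound with respect to one factor, so the recorded values in `bounds` never decrease and stay below `log_evidence`; the remaining gap reflects the factorisation constraint.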

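Here we use the notation ⟨·⟩_q to denote an expectation under the distribution q. By using an approximation that factorises across the parameters of the model,

$$q(\mathbf{x}, \mathbf{W}, \boldsymbol{\alpha}, \beta, \boldsymbol{\mu}) = q_x(\mathbf{x})\, q_w(\mathbf{W})\, q_\alpha(\boldsymbol{\alpha})\, q_\beta(\beta)\, q_\mu(\boldsymbol{\mu}) \tag{15}$$

we obtain a lower bound on the model log-likelihood of the form

$$\begin{aligned}
\ln p(\mathbf{t} \mid \mathcal{M}) \ge{} & \left\langle \ln p(\mathbf{t} \mid \mathbf{x}, \mathbf{W}, \beta, \boldsymbol{\mu}) \right\rangle_{q_x q_w q_\beta q_\mu} + \left\langle \ln p(\mathbf{x}) \right\rangle_{q_x} + \left\langle \ln p(\mathbf{W} \mid \boldsymbol{\alpha}) \right\rangle_{q_w q_\alpha} \\
& + \left\langle \ln p(\boldsymbol{\alpha}) \right\rangle_{q_\alpha} + \left\langle \ln p(\beta) \right\rangle_{q_\beta} + \left\langle \ln p(\boldsymbol{\mu}) \right\rangle_{q_\mu} \\
& + S(q_x) + S(q_w) + S(q_\alpha) + S(q_\beta) + S(q_\mu)
\end{aligned} \tag{16}$$

where S(q) denotes the entropy of the distribution q.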

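For the Bayesian ICA model as we have described it, all the expectations required in equation (14) can be evaluated analytically, giving the following results:

$$q_x = \frac{1}{Z} \prod_{n=1}^{N} \sum_{m_1=1}^{M} \cdots \sum_{m_I=1}^{M} \pi^{(n)}_{\mathbf{m}}\, \mathcal{N}\!\left(\mathbf{x}_n \mid \mathbf{f}^{(n)}_{\mathbf{m}}, \mathbf{D}_{\mathbf{m}}\right) \tag{17}$$

$$q_\mu = \mathcal{N}\!\left(\boldsymbol{\mu} \mid \mathbf{m}_\mu, \boldsymbol{\Sigma}_\mu\right) \tag{18}$$

$$q_w = \prod_{p=1}^{P} \mathcal{N}\!\left(\mathbf{w}_p \mid \mathbf{m}^{(p)}_w, \boldsymbol{\Sigma}_w\right) \tag{19}$$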
