Pentaho+Metadata+Editor

Pentaho+Metadata+Editor
Pentaho+Metadata+Editor

00. 术语元数据(Metadata Terminology)

z公共数据仓库模型CWM? (Common Warehouse Metamodel)

The Common Warehouse Metamodel (CWM?) is a specification that describes metadata interchange among data warehousing, business intelligence, knowledge management and portal technologies. The Pentaho metadata model is in compliance with the CWM? specification. For more information, please

see the CWM? Resource Page.

公共数据仓库模型是一种规范标准,限定了数据仓库、商业智能、知识管理、端口(portal)技术之间交换的元数据的格式。Pentaho元数据模型符合公共数据仓库模型标准,需要更多信息,请参照CWM? Resource Page.

z域(Domain)

A metadata Domain is a Pentaho term representing all of the business objects created, stored and used in the metadata layer. A domain may consist of one or more connections, one or more models, security information, business tables, business views, categories, columns and concepts. You can create and save multiple Pentaho metadata domains using the Pentaho Metadata Editor.

A metadata domain is accessed by a Pentaho BI server by exporting (or publishing) the domain to a file named metadata.xmi, and placing the file in the Pentaho solutions repository at the root of a

solution's directory structure.

元数据域是pentaho的一个词,代表了在元数据层创建、存储、使用的业务对象。一个域可以是一个或多个的连接、模型、安全信息,业务视图、目录(categories)、表中的列以及概念(concepts)构成。你可以用pentaho元数据编辑器创建、保存多个pentaho元数据域。

Pentaho BI服务器通过一个metadata.xmi文件访问一个元数据域,这个元数据域文件被放在pentaho solutions 资源库的根目录。

z连接(Connection)

A Connection represents one database's connection information, and acts as the parent in the hierarchy for all physical tables and physical columns that are defined for that database. Pentaho metadata models can connect to most common relational databases using JDBC.

连接表示了和数据库的连接信息,处在数据库中所有实体表、实体列的最上层。Pentaho 元数据模型能够通过jdbc和几乎所有的关系数据库相连。

z实体表(Physical Table)

A Physical Table is an object that represents table information mapped directly to a table from your database (or connection).

实体表是一个和数据库表直接映射的表信息对象。

z实体列(Physical Column)

A Physical Column is an object that represents column information mapped directly to a column from a table in your database (or connection).

实体列是一个和你的数据库表的列直接映射的列信息对象。

z业务模型(Business Model)

A Business Model contains all of the logical, abstract business objects and relationships that model your physical database objects in such a way that underlying changes to the physical database objects have minimal impact on your business, your business intelligence application, and your end users. There can be multiple business models in a single domain. A business model currently supports just one database connection, and consists of business tables, relationships, and a single business view.

一个业务模型包括所有的逻辑、抽象业务对象和通过一定方式形成的实体数据库对象之间的关系。那种方式保证了实体数据库对象可能的变更对你的业务、业务程序、最终用户影响最小。在一个单个的域中能有多个业务模型。一个业务模型目前只支持一个数据库连接,它由业务表、关系、和一个业务视图构成。

z业务表(Business Table)

A Business Table is the logical representation of a physical table. This is the layer in the model that insulates your applications and users from database changes. The business table models your physical table, so changes to the database only require changes to this layer in the metadata model, and NOT to the hundreds of reports, dashboards, transformations, etc. your users have created against this

data.

业务表是实体表的逻辑表示。这些业务表模型形成的层次把你的应用、用户和数据库的变化隔离开了。这些业务表模仿你的实体表,所以当数据库改变时,你只需要业务层次上的元数据模型,而不用去更改数以百计的那些当初用户依赖数据创建的报表、仪表盘、转换等

z业务列(Business Column)

A Business Column is the logical representation of a physical column. See Business Table for

further description.

Note that a business column can have a business table as its parent, or a business category as its parent. They represent the same physical column regardless of which type of parent they have, but serve different purposes. A business column as child of a business table is modeling the physical column, and plays a part in describing the relationships between the business tables (think primary / foreign key relationships). A business column as a child of a business category is representing a column for applications and users to consume, and as such may have a richer set of properties.

业务列是实体烈的逻辑表示。更多细节参照业务表。

需要注意的是业务列的“父亲”可以是业务表也可以是业务类。尽管他们的“父亲”不同,他们表示的依然是相同的实体列,但这样做有不同的目的,如果业务列是业务表的孩子,它有点像模仿实体列的角色,起的作用是展示和业务表的关系(比如主键和外键);如果业务列是业务类的孩子,它代表应用程序、用户、以及一些属性集合的用到的一列。

z业务关系(Business Relationship)

A Business Relationship describes a relationship between two tables. This means determining

key column matches between two tables, as well as the 1:1, 1:N, N:N relationship.

Business relationships need to be defined for all modeled tables that have such relationships.

业务关系描述两个表之间的关系,这意味着关键键值列在两表之间匹配,如1:1, 1:N, N:N关系,对所有有这种关系的模型表业务关系都需要定义。

z业务类(Business Category)

A Business Category is a bucket to hold a logical grouping of business columns. Note that the business columns added to a particular business category can come from multiple, different business tables. A business category does not have any relationship to the business tables defined in the model, even though you can create a business category from a business table (a convenience feature). Business categories are independent, relatively featureless buckets used primarily for organizing your business columns in a fashion that is more intuitive to your users.

业务类表示的业务列的逻辑分组,需要注意的是向某个业务类中添加的业务列可以来自多个不同的业务表。即使你可以从一个业务表创建一个业务类(一种常见的例子),但一个业务类和业务表没有任何关系。业务类主要是用来组织那些让你用户觉得有些勉强的、联系不紧的业务列,它是独立的、没有显著特征。

z业务视图(Business View)

The Business View is a collection of business categories that represents the "view" of your model, typically consumed by your end users. Each model can have one and only one business view. Business views are made up of a logically (logically relevant to your organization or end-users) organized

business categories and business columns.

业务视图是业务类的集合,表示你的模型的视图,主要用在你的最终用户。每个模型能有且仅有一个业务视图。业务视图是由一组业务类和业务列按照逻辑构成的,这种逻辑是和你的组织结构和终端用户相关的。

z属性Property (Properties)

A Property represents a metadata attribute that will help richly describe your data. A "property" has an identifier (a key into a map actually) and a value. An example property might be "font" (the identifier),

with a value of "Arial" (the value).

属性表示你将详细描述的元数据的特征。一个属性有一个标识(实际是map 的键)和一个值。举个例子,“font”是表示,它的值是“arial”

z概念(Concept)

A concept is a collection of properties. A "property" has an identifier (a key into a map actually) and a value. This collection of properties is the map of attributes that you wish to apply to a particular business object, such as a business column or business table. The concept IS the metadata you define

for your business objects.

Each business object (physical table, business table, columns, etc.) has its own concept whose properties override all other studio:inherited or studio:parented concepts.

A business object may or may not have an studio:inherited concept, depending on the object and its level in the model hierarchy. For more details about about the model hierarchy and inheritance, review

the metadata business model overview.

Concepts can also be defined independent of any business object, and can be structured in an inheritance hierarchy for better organization and management of your metadata. These independent concepts can then be applied to one or more business objects as a studio:parent concept.

概念是属性的集合,一个属性有一个标识和一个值。属性的集合是你想运用在一个特殊业务对象(比如业务列和业务表)的特征的映射,概念是你对业务对象定义的元数据。

每个业务对象(实体表、业务表、列等等)都有自己的概念,它的属性覆盖了其他的继承的或“父概念“。

一个业务对象可以有一个、或者没有继承概念,主要依靠这个对象和它处的模型层次,更多细节,参阅模型的继承和层次。回顾参阅metadata business model overview.

概念也可以是一些业务对象的继承定义,可以以继承层次的构造,以更好的组织管理你的元数据。这种独立的概念可以应用到一个或多个业务对象作为“父概念“。

z父概念(Parent Concept)

A Parent Concept is an independently defined, reusable concept that you can apply to a business object. While each business object has its own collection of properties, you can use a parent concept to add a pre-defined additional set of properties to the business object. If there is any overlap in properties between the parent concept and the business object's own concept, the business object's own concept

will be used first.

一个父概念是一个独立的定义,是你可以运用到业务对象的可重用的概念。当每个业务对象有它自己的属性集合,你可以用一个父概念来向业务对象添加一个预先定义的额外的属性集合。如果父概念和业务对象自己的概念的属性有什么交跌,业务对象他们自己的概念首先会被用。

z继承概念(Inherited Concept)

An Inherited Concept is a collection of properties that a business object will be assigned from a related business object. For instance, a business column is related to a physical column, so a business column

will inherit any and all concept properties from the physical column. For more details about about the model hierarchy and inheritance, review the metadata model overview.

继承概念是从相关的业务对象分派的一个业务对象的属性的集合。比如,一个

业务列和一个实体列相关,那么这个业务列继承任何所有的实体列概念属性。更多

细节参阅模型层次和继承,回顾参阅metadata model overview.

01. 元数据业务模型概述( Metadata

Business Model Overview )

z Introduction

Pentaho Metadata Editor (PME) is a tool that builds Pentaho metadata domains and models. A Pentaho metadata model maps the physical structure of your database into a logical business model.

These mappings are stored in a centralized metadata repository and allow administrators to:

?create business-language definitions for complex or cryptic database tables

?decrease the cost and impact associated with low level database changes

?set security parameters limiting user's report access to data

?drive formatting on text, date and numeric data improving report maintenance

?localize the information depending on the user's regional settings.

The metadata business model is actually one major component in a Pentaho metadata domain. The domain encapsulates both the physical descriptions of your database objects and the logical model (the business model), the abstract representation of the database. The purpose of this page is to give

a Pentaho metadata consumer a good idea of the relationships involved throughout a Pentaho domain,

in order to fully leverage the power of the business models.

The following diagram depicts the business objects and relationships within the Pentaho metadata

domain:

In this diagram, both the relationship arrows in the legend as well as the color of the business objects

are significant.

Each independent business object has its own color; thus, connections, physical tables, business tables, physical columns, business columns and categories are the main business objects in a Pentaho domain. The two types of relationships depicted in the diagram are key to understanding Pentaho metadata.

The "inherits from" relationship is a relationship between two different business objects where one will inherit metadata from the other, with the ability to override any of the inherited metadata properties.

An example of this is that a business table inherits metadata from its associated physical table.

The second relationship, the "same object" relationship, is one where the two business objects are actually one and the same, just represented in a different organization or duplicated for usability. This is the relationship between business columns in the Abstract Business Layer, and business columns in

the Business View.

Pentaho元数据编辑器是构建pentaho元数据域和模型的工具。一个pentaho 元数据模型是你的数据库实体结构映射的逻辑业务模型。这些映射被集中存储在元数据资源库,允许管理员作如下事情:

对复杂的模糊的数据库表创建业务语言的定义。

降低由于低层数据库改变引起的影响和变动。

设置安全参数以限制用户的报表访问数据。

对文本、日期、数字等数据格式化以提高报表的稳定性。

依据用户的地域性设置把信息本地化。

元数据业务模型是pentaho元数据域中一个主要的组成部分。pentaho域囊括了数据库对象的实体性描述和逻辑模型(业务模型),以及数据库的抽象表示。下面的这些文字的目的是给用户明白的展示一个pentaho域和他饱含的各部分组件的关系,进而充分的理解业务模型的作用。

下图详细的阐述了业务对象和pentaho元数据域的关系。

在这幅图中箭头的方向以及业务对象的颜色都是很有意义的。

每一个独立的业务对象有它自己的颜色,这包括pentaho域中主要的业务对象:连接、实体表、业务表、实体列、业务列以及分类。

图中描述的两种关系是理解pentaho元数据的关键。

继承是两个不同的业务对象,其中一个继承另一个的元数据,继承者具有覆盖被继承的元数据。继承关系的一个例子是业务表从相关的实体表继承元数据。

第二种关系是相同对象关系,两个业务对象实际上是一个,相同的,只是以一种不同的方式表示或者是一个对象的复用。在抽象业务层的业务列和业务视图中的业务列的关系就是同一个关系。

z The Physical Layer

The Physical Layer of a Pentaho domain encompasses connections, physical tables and physical columns. These objects represent the database(s) you are trying to model and enrich with metadata. The Physical Layer is not considered part of the business model, because not all connections defined in the Physical Layer will be used in every business model.

Today, multiple business models can be created in one domain, but those models must have one and only one connection reference. This means that you cannot mix and match physical tables from two different connections in the same model yet. We realize that this prevents the model from supporting multiple data sources, as well as severely limits the Pentaho Metadata Editor's ability to allow table changes across connections, a feature necessary for moving from dev to production databases, for example. Fortunately, you can get around the latter with a bit of hand-editing XML. Note that these features are important to us and are early inclusions in the metadata roadmap.

一个pentaho域的实体层包括连接、实体表和实体列。这些对象代表了你正用元数据建模和细化的数据库。实体层并不被认为是业务模型,因为不是所有在实体层定义的连接都会在每个业务模型中用到。

现在,多个业务对象模型可以在一个pentaho域中被创建,但是这些模型必须有一个且仅有一个相关的连接。这意味着你还不能在一个相同的模型中从两个不同的连接混合或匹配实体表。我们意识到这样会使模型不能支持多数据源,还严格限制了pentaho元数据编辑器允许表改变连接的能力,比如以后要从一个设备转移到产品数据库的需求。幸运的是你可以通过手工编辑对应的xml文件方式摆脱这种困境。记住这些特征对我们很重要,而且一直是实现原数据的目标。

z The Abstract Business Layer

The Abstract Business Layer is the heavy lifter in the metadata business model. The business model encompasses the Abstract Business Layer and the Business View.

In the Abstract Business Layer, you have business tables, business columns and relationships.

You can create business tables for any physical table you have defined in the Physical Layer. You can also create more than one business table to reference the same physical table. The same rules apply for business columns. This can be useful in a multitude of scenarios, filtering security or even data at

this level being one example.

The business table keeps a reference to the physical table that it models, and this allows a metadata inheritance relationship between physical tables and business tables. If you define metadata on a physical table, the business table will inherit that metadata, unless and until the business table itself has overridden the inherited metadata. This concept operates on a metadata property to property basis, meaning that each property can be overridden individually.

The same relationship exists between physical columns and business columns. If you define metadata on a physical column, the business column will inherit that metadata, unless and until the business column itself has overridden the inherited metadata.

Then there are relationships. Don't confuse the term relationships here with the relationships I described in the domain diagram. The relationships I am talking about here are not represented in that

diagram. These relationships are mappings that you define to describe the relational (or other) bonds between your business tables. One example may be a one to many relationship between a customer table and an orders table. The strength in metadata relationships is that you can identify relationships between columns or tables in the Abstract Business Layer that may not be obvious in the Physical Layer. An example would be to compare budget, actual and sales numbers, using a formula-derived business

column that is specific to your business rules.

抽象业务层在元数据业务模型中是及其重要的,业务模型包括抽象业务层和业务视图。

在抽象业务层中,包括业务表、业务列和关系。

你可以为任何你已经在实体层定的实体表创建业务表,也可以对一个相关的实体表创建多个对应的业务表。这个规则同样适用于业务列。这对于复杂、模糊、安全性要求比较高的数据这样做是非常有用的。

业务表保持了对实体表的一个参照,这样允许他们之间是一种元数据继承关系。如果实体表定义了一种元数据,那么业务表也就继承了这种元数据,除非业务表的自己定义的元数据覆盖了它。

接着是关系,请不要混淆这个词,这里的关系不是pentaho域中描述的关系,这种关系是业务表的映射关系。一个例子是,顾客和订单表之间是一对多的关系。在抽象业务层确定的表或列的关系,在实体层是不能区分的,这就是元数据关系的作用。可以举个例子,预算、实际销售、和销售数量,用一个有公式的业务列来指定你的业务规则。

z The Business View

The Business View is the part of the business model that applications will operate against, and end-users will see. The Business View is nothing more than "buckets" (called categories) for you to re-arrange and re-organize your business columns in a fashion that makes sense to the consumers of

the data.

In the Business View, you can create any number of categories and arrange your business columns in those categories however it best suits your business, mixing and matching columns that derived from different business tables, even adding business columns more than once to different categories. The only restriction to the Business View is you cannot add the same business column more than once to

a single category.

业务视图是业务模型的一部分,应用程序将对它操作,终端用户也将看见它。业务视图不过是一个“buckets”(叫分类),因为你可以多少重新排列、组织你的业务列使它对数据的用户更有意义。

在业务视图中,你可以创建任意数量的分类,在这些分类上重新排列你的业务列,但是最好符合你的业务,从不同的业务表抽取、混合数据,甚至对不同的分类添加多个业务列。业务视图唯一的限制是不能对同一个分类添加相同的业务列。

z Incorporating Metadata

Each business object in the domain can have metadata associated with it, with the exception of categories. In Pentaho terminology, a collection of metadata properties is called a concept.

Each business object can have 3 levels of concepts: its own (self or child concept), a parent concept and an inherited concept. All of the defined metadata properties from the three levels are available (to metadata-aware applications, end-users) on a business object. It is important to understand what the override hierarchy is, when more than one concept level has defined the same metadata property.

When a metadata property is set in the business object itself, and that same property is in play (either inherited or from a parent concept), the business object's self or child concept will override all others. The next level in the hierarchy is the parent concept. So, for example, say the name property on a business object is NOT set on the object itself, but IS inherited from its physical counterpart. Then you set a parent concept on the business object with a name property defined in the parent concept. The parent concept's name property would override the inherited name property. So, in the end, the inherited metadata properties are used if and only if the same property is not defined somewhere down the hierarchical chain, as part of a self concept or parent concept.

For more clarification, check out the Metadata Concepts Technical Overview, which outlines the

concept inheritance and override rules.

在pentaho域的每一个业务对象都有元数据与它关联,分类除外。在pentaho 术语中,一个元数据属性的集合叫做概念。

每个业务对象有三个层次的概念,它本身(自身或孩子概念),父概念,继承概念。业务对象所有在这三个层次的定义的元数据属性的集合对应用程序和用户都是可以用到的。当在相同的元数据属性上定义多层概念时,理解覆盖是很重要的。

当一个业务对象给自身设置一个元数据属性时,相同的属性在父概念也有,业务对象自己或它孩子的属性将覆盖其他的属性。在层次上,下一级是父概念。比如说,名字属性不是业务对象自己设置的,而是来自对应的实体对象。然后你用一个名字属性设置到你的父概念上。父概念的名字将覆盖继承的名字。所以,最后,如果在层次连上没有定义相同的属性,或者相同的属性只是自己概念或父概念的一部分,那么继承的元数据将其作用。

相关主题
相关文档
最新文档