JanusGraph: Learning a Graph Database (1)

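Abstract:
A graph database is a non-relational database that applies graph theory to store relationship information between entities. The most common example is the relationships between people in a social network. Relational databases handle this kind of relationship-heavy data poorly: the queries are complex, slow, and costlier than expected, and the design of a graph database addresses exactly this weakness. A graph database has two basic data types, Nodes and Relationships. Nodes are connected to one another through the relations that Relationships define, forming a network structure.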

Introduction to graph databases (source: Baidu Baike)

1. Introduction

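A graph database is a type of NoSQL (non-relational) database that applies graph theory to store relationship information between entities. The most common example is the relationships between people in a social network. Relational databases handle this kind of relationship-heavy data poorly: the queries are complex, slow, and costlier than expected, and the design of a graph database addresses exactly this weakness.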

2. Data structure of a graph database

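A graph database has two basic data types:

Nodes and Relationships.

Both Nodes and Relationships carry properties in key/value form. Nodes are connected to one another through the relations that Relationships define, forming a network structure.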
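The node/relationship model described above can be sketched in a few lines of Python. This is only an illustrative in-memory toy, not how any real graph database stores data; the class names (`Node`, `Relationship`, `Graph`), the `KNOWS` relationship type, and the sample people are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with arbitrary key/value properties."""
    node_id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed relationship that also carries key/value properties."""
    start: int      # node_id of the source node
    end: int        # node_id of the target node
    rel_type: str
    properties: dict = field(default_factory=dict)


class Graph:
    def __init__(self):
        self.nodes = {}          # node_id -> Node
        self.relationships = []  # list of Relationship

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, properties)

    def add_relationship(self, start, end, rel_type, **properties):
        self.relationships.append(Relationship(start, end, rel_type, properties))

    def neighbors(self, node_id, rel_type):
        """Return the nodes reachable from node_id over one rel_type relationship."""
        return [self.nodes[r.end] for r in self.relationships
                if r.start == node_id and r.rel_type == rel_type]


g = Graph()
g.add_node(1, name="Alice")
g.add_node(2, name="Bob")
g.add_node(3, name="Carol")
g.add_relationship(1, 2, "KNOWS", since=2015)
g.add_relationship(2, 3, "KNOWS", since=2018)

# Friends of Alice's friends: follow KNOWS twice, starting from node 1.
friends_of_friends = [fof.properties["name"]
                      for friend in g.neighbors(1, "KNOWS")
                      for fof in g.neighbors(friend.node_id, "KNOWS")]
print(friends_of_friends)
```

The "friends of friends" question is answered by following relationships from node to node, where a relational schema would typically need a self-join on a link table for every hop.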

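3. JanusGraph

Note: I studied from the official documentation and other learning materials. If you find any mistakes, please point them out.

1. Advantages of JanusGraph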

JanusGraph is designed to support the processing of graphs so large that they require storage and computational capacities beyond what a single machine can provide. Scaling graph data processing for real time traversals and analytical queries is JanusGraph’s foundational benefit. This section will discuss the various specific benefits of JanusGraph and its underlying, supported persistence solutions.

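In other words: JanusGraph is built for graphs too large for a single machine to store or process, and its core benefit is scaling graph processing for both real-time traversals and analytical queries.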

1.1 Basic benefits

  • Support for very large graphs. JanusGraph graphs scale with the number of machines in the cluster.
  • Support for very many concurrent transactions and operational graph processing. JanusGraph’s transactional capacity scales with the number of machines in the cluster and answers complex traversal queries on huge graphs in milliseconds.
  • Support for global graph analytics and batch graph processing through the Hadoop framework.
  • Support for geo, numeric range, and full text search for vertices and edges on very large graphs.
  • Native support for the popular property graph data model exposed by Apache TinkerPop.
  • Native support for the graph traversal language Gremlin.
  • Easy integration with the Gremlin Server for programming language agnostic connectivity.
  • Numerous graph-level configurations provide knobs for tuning performance.
  • Vertex-centric indices provide vertex-level querying to alleviate issues with the infamous super node problem.
  • Provides an optimized disk representation to allow for efficient use of storage and speed of access.
  • Open source under the liberal Apache 2 license.
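The vertex-centric index item in the list above is easier to picture with a toy model. The sketch below is not JanusGraph's implementation. It assumes a hypothetical `Vertex` class whose edges each carry a label and a numeric `time` property, and it compares a full scan of a "super node" with a lookup in a per-vertex index kept sorted by `time`.

```python
import bisect
from collections import defaultdict


class Vertex:
    """Toy vertex that stores incident edges both flat and in a local index."""

    def __init__(self, name):
        self.name = name
        self.edges = []  # flat list of (label, time, neighbor)
        # Local index: label -> (sorted times, neighbors in the same order).
        self.index = defaultdict(lambda: ([], []))

    def add_edge(self, label, time, neighbor):
        self.edges.append((label, time, neighbor))
        times, neighbors = self.index[label]
        pos = bisect.bisect_right(times, time)
        times.insert(pos, time)
        neighbors.insert(pos, neighbor)

    def scan(self, label, t_min, t_max):
        """No index: inspect every incident edge of the vertex."""
        return [n for (l, t, n) in self.edges
                if l == label and t_min <= t <= t_max]

    def lookup(self, label, t_min, t_max):
        """Vertex-local index: binary search within the edges of one label."""
        times, neighbors = self.index.get(label, ([], []))
        lo = bisect.bisect_left(times, t_min)
        hi = bisect.bisect_right(times, t_max)
        return neighbors[lo:hi]


# A hypothetical super node: 10,000 follower edges and only three "wrote" edges.
hub = Vertex("celebrity")
for i in range(10_000):
    hub.add_edge("followed_by", i, f"fan-{i}")
for t in (5, 50, 500):
    hub.add_edge("wrote", t, f"post-{t}")

print(hub.lookup("wrote", 10, 100))
```

Both methods return the same edges, but `scan` inspects all 10,003 incident edges, while `lookup` only binary-searches the three `wrote` entries. That difference is the point of indexing edges locally at a high-degree vertex.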

1.2 Integration with HBase

  • Tight integration with the Apache Hadoop ecosystem.
  • Native support for strong consistency.
  • Linear scalability with the addition of more machines.
  • Strictly consistent reads and writes.
  • Convenient base classes for backing Hadoop MapReduce jobs with HBase tables.
  • Support for exporting metrics via JMX.
  • Open source under the liberal Apache 2 license.

1.3 JanusGraph and the CAP Theorem

Despite your best efforts, your system will experience enough faults that it will have to make a choice between reducing yield (i.e., stop answering requests) and reducing harvest (i.e., giving answers based on incomplete data). This decision should be based on business requirements.

--Coda Hale

When using a database, the CAP theorem should be thoroughly considered (C=Consistency, A=Availability, P=Partitionability). JanusGraph is distributed with 3 supporting backends: Apache Cassandra, Apache HBase, and Oracle Berkeley DB Java Edition. Note that BerkeleyDB JE is a non-distributed database and is typically only used with JanusGraph for testing and exploration purposes.

HBase gives preference to consistency at the expense of yield, i.e. the probability of completing a request. Cassandra gives preference to availability at the expense of harvest, i.e. the completeness of the answer to the query (data available/complete data).
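The two terms in the paragraph above, yield (the probability of completing a request) and harvest (data available / complete data), can be put into numbers. The scenario below is hypothetical: 1,000 requests, each needing data from 4 partitions, with 100 of them arriving while one partition is unreachable. It only illustrates the two definitions and says nothing about how HBase or Cassandra behave in a particular failure.

```python
def yield_ratio(completed_requests, total_requests):
    """Yield: fraction of requests that were answered at all."""
    return completed_requests / total_requests


def harvest_ratio(data_available, complete_data):
    """Harvest: fraction of the complete data reflected in an answer."""
    return data_available / complete_data


TOTAL_REQUESTS = 1000        # hypothetical workload
REQUESTS_DURING_OUTAGE = 100
PARTITIONS = 4
PARTITIONS_DOWN = 1

# Consistency-first choice: refuse requests that cannot see all partitions.
cp_yield = yield_ratio(TOTAL_REQUESTS - REQUESTS_DURING_OUTAGE, TOTAL_REQUESTS)
cp_harvest = harvest_ratio(PARTITIONS, PARTITIONS)  # every answer is complete

# Availability-first choice: answer everything from the reachable partitions.
ap_yield = yield_ratio(TOTAL_REQUESTS, TOTAL_REQUESTS)
ap_outage_harvest = harvest_ratio(PARTITIONS - PARTITIONS_DOWN, PARTITIONS)

print("consistency-first:  yield", cp_yield, "harvest", cp_harvest)
print("availability-first: yield", ap_yield, "harvest during outage", ap_outage_harvest)
```

Under these assumptions the consistency-first system completes 90% of requests with complete answers, while the availability-first system completes all of them but answers the outage-time requests from 75% of the data.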

About the CAP theorem: C = Consistency, A = Availability, P = Partitionability. See https://en.wikipedia.org/wiki/CAP_theorem

2. Overall architecture of JanusGraph

Data storage:

Indices, which speed up and enable more complex queries:

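How applications interact with JanusGraph

  • Embed JanusGraph inside the application and execute Gremlin queries directly against the graph within the same JVM. Query execution, JanusGraph's caches, and transaction handling all happen in the same JVM as the application, while data retrieval from the storage backend may be local or remote.
  • Interact with a local or remote JanusGraph instance by submitting Gremlin queries to a server. JanusGraph natively supports the Gremlin Server component of the Apache TinkerPop stack.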

[Figure: JanusGraph architecture]

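The architecture has three layers:

the client layer, the business analysis layer, and the storage layer.

Business analysis layer: online transaction processing (OLTP) and online analytical processing (OLAP).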
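To make the OLTP/OLAP split concrete, here is a small sketch on a hypothetical in-memory adjacency list. The graph, the function names, and the chosen statistic (in-degree) are assumptions for illustration only. An OLTP-style query starts at one vertex and touches only its neighborhood, while an OLAP-style job scans the whole graph to compute a global result.

```python
# Hypothetical toy graph as an adjacency list: vertex -> outgoing neighbors.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "carol": ["alice"],
    "dave": ["carol"],
}


def oltp_neighbors(graph, start):
    """OLTP-style: start at one vertex and read only its local neighborhood."""
    return sorted(graph.get(start, []))


def olap_in_degrees(graph):
    """OLAP-style: scan every vertex and edge to compute a global statistic."""
    counts = {vertex: 0 for vertex in graph}
    for neighbors in graph.values():
        for neighbor in neighbors:
            counts[neighbor] = counts.get(neighbor, 0) + 1
    return counts


alice_neighbors = oltp_neighbors(graph, "alice")
in_degrees = olap_in_degrees(graph)
print(alice_neighbors)
print(in_degrees)
```

The first call reads one adjacency list regardless of graph size. The second must visit every edge, which is the kind of whole-graph batch work that the benefits list earlier delegates to the Hadoop framework.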
