Multivac is a “Unified Big Data Analytics Platform” launched in 2015. It hosts more than 360 billion web pages, social media posts, and other pieces of digital content.
						
Multivac offers three main services: Multivac Dashboards (interactive visualizations), Multivac API Engine (real-time REST APIs), and Multivac DSL (interactive and collaborative notebooks for Big Data).
					
We've built scientific tools to dive into large-scale social media and Web data
Data collection counters: +337B, +4B, +2B, +2.5B, +1B, +1B, +1B, +710M, +410M, +80M
							Interactive and real-time dashboards.
The Multivac Platform offers curated dashboards to support common scientific use cases across a variety of research topics, such as Climate Change, international and French media, French political communities, Healthcare, and many more.
Design, develop, and build your new research projects.
The Multivac API provides programmatic access to the data stored on Multivac in unique and advanced ways. The Multivac Platform offers a complete set of REST APIs for accessing raw and aggregated data in real time, ready to be used in research projects, in interactive or offline visualizations, and more.

Multivac uses Swagger to design, build, and document all of its RESTful APIs. Swagger is a powerful open-source framework backed by a large ecosystem of tools. Multivac also follows the OpenAPI Initiative (OAI) to standardise how its REST APIs are described.
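As a rough sketch of how a research script might consume one of these REST endpoints, the Python example below issues a single call and parses the JSON response. The base URL, endpoint path, query parameters, and API-key header are illustrative assumptions, not the documented Multivac API.

import requests

# A hypothetical call to the Multivac API Engine. The base URL, endpoint
# path, query parameters, and API-key header are illustrative placeholders,
# not the documented Multivac API.
BASE_URL = "https://api.multivac.example.org"

def fetch_aggregated_counts(collection, query, start, end, api_key):
    """Request aggregated document counts for a query over a date range."""
    response = requests.get(
        f"{BASE_URL}/v1/{collection}/aggregate",
        params={"q": query, "from": start, "to": end, "interval": "day"},
        headers={"X-API-Key": api_key},
        timeout=30,
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()

counts = fetch_aggregated_counts(
    collection="media",
    query="climate change",
    start="2020-01-01",
    end="2020-12-31",
    api_key="YOUR_API_KEY",
)
print(counts)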
									
 
							Interactive notebooks for Big Data.
We have designed and implemented an Apache Hadoop cluster spanning more than 40 servers inside our private Cloud.

Together, Hadoop and the Cloud give us faster elasticity when scaling our infrastructure, along with access to components such as YARN, HDFS, Apache Spark, and Apache Hive, so we can run any number of jobs in parallel at scale over hundreds of millions of records.
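For a concrete picture of how these components fit together, here is a minimal PySpark sketch that starts a session scheduled through YARN, enables Hive support, and runs a distributed SQL aggregation. The application name and the Hive table name are placeholders, not actual Multivac resources.

from pyspark.sql import SparkSession

# Minimal sketch: a Spark session scheduled through YARN with Hive support.
# The application name and the Hive table name are illustrative placeholders.
spark = (
    SparkSession.builder
    .appName("multivac-example")
    .master("yarn")           # let YARN allocate executors across the cluster
    .enableHiveSupport()      # read tables registered in Apache Hive
    .getOrCreate()
)

# Run a distributed SQL aggregation over a (hypothetical) Hive table.
df = spark.sql("SELECT lang, COUNT(*) AS n_docs FROM media_articles GROUP BY lang")
df.show()

spark.stop()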
									
 
								 
								
Multivac offers interactive Hadoop notebooks by hosting multi-user, multi-tenant Apache Zeppelin and Hue.

In addition to notebook-powered analytics, users can submit their code and jobs to the Multivac Hadoop Cluster through the Apache Spark interactive shell and spark-submit, or through Multivac-hosted interactive notebooks in Scala, Java, Python, R, and SQL.
									
Multivac hosts large-scale datasets such as Wikimedia Pageviews (+337 billion), 4chan (+220 million), and PubMed (+30 million) on its Hadoop cluster (HDFS).

These datasets are accessible to all Multivac DSL users, who can run massively distributed jobs against them for their research projects.
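As an illustration of such a distributed job, the PySpark sketch below aggregates the Wikimedia Pageviews dataset. The HDFS path, file format, and column names are assumptions rather than the dataset's actual layout on Multivac; the same code could run in a Zeppelin notebook or be packaged for spark-submit.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("pageviews-aggregation").getOrCreate()

# Hypothetical HDFS location, format, and schema for the Wikimedia Pageviews
# dataset; the actual layout on the Multivac cluster may differ.
pageviews = spark.read.parquet("hdfs:///datasets/wikimedia/pageviews/")

# Distributed aggregation: total views per project and per day.
daily = (
    pageviews
    .groupBy("project", "date")
    .agg(F.sum("views").alias("total_views"))
    .orderBy("date")
)
daily.show(20)

spark.stop()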
									
 
								Multivac Showcase
We showcase some of our scientific projects built and powered by the Multivac Platform. We used the Multivac API Engine for real-time data visualizations and the Multivac DSL for long-term aggregated studies.