
Data Acquisition

BIG Final Event Workshop

Programme of the BIG Final Event Workshop, co-located with ISC Big Data in Heidelberg

The BIG Project: Welcome and Introduction
Nuria De Lama (ATOS Spain)

Key Technology Trends for Big Data in Europe
Edward Curry (Research Fellow at Insight @ NUI Galway)
presentation

The Big Data Public Private Partnership
Nuria De Lama (ATOS Spain)
presentation

Panel discussion about a common Big Data Stakeholder Platform
Martin Strohbach (AGT International)
● The PPP Stakeholder Platform (Nuria De Lama)
● Hardware and Network for Big Data (Ernestina Menasalvas, RETHINK BIG EU Project)
● Tackling BIG DATA Externalities (Kush Wadhwa, Trilateral Research, BYTE EU Project)
● The value of the Stakeholder platform (Sebnem Rusitschka, Siemens, BIG and BYTE Projects)

Networking and Break-out Sessions

An update will follow.

 

 

D2.2.2 Final Version of Technical White Paper available

The final version of the technical white paper of deliverable D2.2.2 is now available. It details the results from the Data Value Chain technical working groups, describing the state of the art in each part of the chain together with emerging technological trends for exploiting Big Data, and it consolidates the Big Data challenges identified across the different sectors and working groups. The Data Value Chain identifies activities in Data Acquisition, Data Analysis, Data Curation, Data Storage and Data Usage.
A member of the BIG Health forum comments: "We interviewed experts in the biomedical domain to ask for their opinion about the current situation in data acquisition and data quality. We identified challenges that need to be addressed for establishing the basis for BIG health data applications. Additionally, the current data quality challenges were diagnosed and are reported in the deliverable."
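
As a rough illustration of how those five activities fit together (our sketch, not taken from the deliverable), the Data Value Chain can be read as a pipeline in which each stage consumes the previous stage's output; every function below is a stub standing in for the technologies the white paper surveys:

    def acquire():
        """Data Acquisition: gather raw records from some source."""
        return [" 42 ", "17", None, "8"]

    def analyse(records):
        """Data Analysis: extract information, here by parsing numbers."""
        return [int(r) for r in records if r is not None]

    def curate(values):
        """Data Curation: clean and normalise, here by dropping outliers."""
        return [v for v in values if 0 <= v <= 100]

    def store(values):
        """Data Storage: persist the curated data (stubbed as a dict)."""
        return {"dataset": values}

    def use(db):
        """Data Usage: derive value, here a simple aggregate."""
        return sum(db["dataset"]) / len(db["dataset"])

    print(use(store(curate(analyse(acquire())))))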

The BIG project at Big Data World Congress, Munich, 3-4 December 2013

The BIG project had a strong presence at BIG Data World Congress in Munich in early December, with a strategically positioned stand in the exhibition hall. We met a number of delegates from many industrial sectors and countries, especially in the “speed dating” session, where we perfected the BIG project’s elevator pitch in the quick-fire conversations! Project flyers and stickers were available in many places for people who wanted to learn about the project after the conference. The two-day event was closed by a presentation from the BIG project’s director, Josema Cavanillas, introducing the aims of the project and the outputs of our research.
 
The event featured case studies and panels on every aspect of Big Data technologies including governance, unstructured data, real-time analytics and much more. Attendees came from a wide range of organisations, including some big players in sectors such as manufacturing and telecoms. One exciting potential avenue of collaboration may be for BIG to work with the USA’s NIST (National Institute of Standards and Technology) as they are also developing cross-sector consensus requirements and roadmaps for Big Data.
 
Many speakers talked about how adopting Big Data could revolutionise the way businesses operate, driving efficiency and faster product development. It is recognised by most, if not all, senior-level executives as one of the key IT trends of the next few years, but this comes with the caveat that Big Data initiatives need to be aligned to clear outcomes and business processes in order to have a chance of success. The structure of organisations may need to be adapted so that technical and business expertise can work together more closely and value can be derived from data. Even then, the pace of industry change may be such that organisations will look to form partnerships with start-ups and universities to drive innovation. The BIG project’s Public Private Forum could be a key enabler for these communities.
Europe-specific issues were highlighted in several talks. There was criticism of the apparent risk aversion of technology companies and their customers and the lack of a widespread start-up culture (apart from a few isolated exemplars). There are differences between Europe and the US in terms of data protection, the EU’s tougher legislation possibly being a barrier to innovation for some firms (on the other hand, the US’s relatively lax laws may have implications for privacy and the ethics of extensive data collection by businesses).
 

Interview with Andreas Ribbrock, Team Lead Big Data Analytics and Senior Architect at Teradata GmbH

Big Data Analysis Interview with Andreas Ribbrock, Team Lead Big Data Analytics and Senior Architect at Teradata GmbH, is online now:

In his interview, Andreas talked about three classes of technologies required for Big Data: storage (advocating distributed file systems as a competitive way to handle large data volumes); query frameworks that can translate user queries into calls to a set of different query engines (which he calls a 'discovery platform'); and a platform that can deliver the right results to the right personnel in the right time frame.
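
To make the 'discovery platform' idea concrete, here is a minimal Python sketch; the engine classes and the routing rule are hypothetical, standing in for whatever set of query engines sits behind such a platform:

    class SQLEngine:
        def run(self, query):
            return "[sql] executing: " + query

    class GraphEngine:
        def run(self, query):
            return "[graph] traversing: " + query

    class DiscoveryPlatform:
        """Single entry point that routes a user query to a suitable engine."""

        def __init__(self):
            self.engines = {"tabular": SQLEngine(), "graph": GraphEngine()}

        def query(self, user_query, kind="tabular"):
            # A real platform would inspect the query and choose the
            # engine itself; in this sketch the caller states the kind.
            return self.engines[kind].run(user_query)

    platform = DiscoveryPlatform()
    print(platform.query("SELECT * FROM sales", kind="tabular"))
    print(platform.query("friends-of-friends(alice)", kind="graph"))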

Andreas also stressed that integration is key: Big Data cannot be solved by any single technology but requires a suite of technologies to be tightly integrated. In general, any architecture or framework for Big Data must be open and adaptable, so that new technologies and components can be plugged in. Fabric computing, in which components are virtualised and data flows between them at high speed, was suggested as a possible approach.

In terms of impact, two key drivers are the ability of Big Data to let companies personalise their communication with clients, and the ways user communication channels will change. On the one hand, one can integrate channels for energy consumption, phone use and banking. On the other, users may prefer their own channels (which produce a lot of data) and impose these on enterprises in specific markets; for example, traditional banks may soon become obsolete as their functionality is taken over by PayPal (a Teradata customer), Amazon and Google.

He ended the interview with the phrase: "Big Data is Big Fun!"

Big Data Webinar: Data Acquisition, 21st November, 16:00 CET

A recording of the session is available here.
 
We are pleased to announce the start of the Big Data Webinar series and invite you to actively participate and help shape Europe's Big Data Agenda.
 
The Big Data Webinars aim to spur discussion around the findings of the technical working groups and sectorial forums of the Big Data Public Private Forum project: the technical white papers, the sectors' requirements, and the sectorial and cross-sectorial roadmaps.
 
We actively seek contributions and feedback from all stakeholders in their respective technical areas of the Big Data value chain and their respective industrial sectors, to help draft roadmaps that align supply and demand in the European Big Data community and so increase the competitiveness of European industries.
 
The series will start on November 21st at 16:00 CET with a discussion of the technical white paper of the working group on Data Acquisition. The session will be led by Axel Ngonga, data scientist at the University of Leipzig and leader of the technical working group on Data Acquisition. See the current schedule of the series.
 
What to expect
The leader of the technical working group will give a short (max 15min) presentation on the main findings of the Data Acquisition working group. This presentation will be followed by an open and interactive discussion of all participants. We encourage you to participate as we seek your feedback to further develop the first draft white papers. Your contribution will directly influence Europe’s Big Data roadmaps.
 
Who should join
The Webinar is open to the public free of charge. It is tailored for professionals and technologists working in the area of Big Data who are interested in learning more about Data Acquisition and sharing their expertise.
 
How to join
The Big Data Webinars will be held using Google Hangouts on Air. Please join the Webinar by following this link:
 
Please make sure you have joined the Webinar 10 minutes before 16:00 CET on November 21st. We have published the section on Data Acquisition of the Technical Whitepaper for online comments and annotations here.
 
You can attend the Big Data Webinar series twice a month. See our updated schedule and stay tuned. All Webinars will be recorded and published afterwards with slides and a summary of the discussion for those of you who could not make it.


BIG at LSWT2013 - From Big Data to Smart Data - A Summary

The 5th Leipziger Semantic Web Tag (LSWT2013) was organised as a meeting point for German as well as international Linked Data experts.
Under the motto 'From Big Data to Smart Data', sophisticated methods for handling large amounts of data were presented on 23 September in Leipzig.
The keynote was held by Hans Uszkoreit, scientific director at the German Research Center for Artificial Intelligence (DFKI). After this introduction to text analytics and Big Data issues, the participants of LSWT2013 discussed the intelligent usage of huge amounts of data on the web.
 
Presentations on industrial and scientific work showed practical solutions to Big Data concerns. Companies like Empolis, Brox and Ontos presented Linked Data and Semantic Web solutions capable of handling terabytes of data. More traditional approaches, such as Datameer's Hadoop-based data analytics solution, also showed that Big Data can nowadays be handled without major problems.
 
Furthermore, the talks tackled problems in detecting topics in massive data streams (Topic/S), document collections (WisARD) and corpora at information service providers (Wolters Kluwer). Even the ethical issue of robots replacing journalists with the help of semantic data was examined, by Alexander Siebert from Retresco.
 
In conclusion, the analysis of textual information in large amounts of data is an interesting and not yet fully solved area of work. Further information is available from the website.
 
 Further information on topics related to data analysis, data curation, data storage, data acquisition and data usage can be found in our technical whitepaper available from our project website.

Big Data Analysis Interview with Steve Harris, Chief Technology Officer at Garlik, an Experian Company


Check out the new Big Data Analysis interview with Steve Harris, Chief Technology Officer at Garlik, an Experian company, available in the following formats:

Steve's company has as its main focus the prediction and detection of financial fraud through the use of its customised RDF store and SPARQL. They harvest several terabytes of raw data from chat rooms and forums associated with hackers and generate around one billion RDF triples from it. In terms of areas that need work, Steve suggested optimising the performance of these stores. We also discussed the need to make sure that the infrastructure is economically viable, and that training staff to use RDF/SPARQL was not a big issue.
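
As an illustration of the kind of workflow Steve describes (using the rdflib library as a stand-in for Garlik's own 5store, with a made-up vocabulary and data), harvested statements can be loaded as RDF triples and queried with SPARQL:

    from rdflib import RDF, Graph, Literal, Namespace

    EX = Namespace("http://example.org/fraud/")  # hypothetical vocabulary

    g = Graph()
    post = EX["post/1"]
    g.add((post, RDF.type, EX.ForumPost))
    g.add((post, EX.mentions, Literal("4532-0151-1283-0366")))  # fake card number

    # Find every forum post and the data items it mentions.
    results = g.query("""
        PREFIX ex: <http://example.org/fraud/>
        SELECT ?post ?item
        WHERE { ?post a ex:ForumPost ; ex:mentions ?item . }
    """)
    for row in results:
        print(row.post, row.item)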

Steve Harris leads the design and development of a multi-million-user product in the financial services industry at Garlik, an Experian company. In the Semantic Web community, he is widely regarded as the architect of Garlik's open-source, scalable RDF platform, 5store, and has served on the World Wide Web Consortium (W3C) working groups that defined the SPARQL query language [1].

Big Data Analysis Interview with Ricardo Baeza-Yates, VP of Research for Europe and Latin America at Yahoo!


Check out the new Big Data analysis interview with Ricardo Baeza-Yates, VP of Research for Europe and Latin America at Yahoo!

 

 

The main themes Ricardo suggested investing in are:

a) what he called Hadoop++: the ability to handle graphs with trillions of edges, since MapReduce doesn't scale well for graphs; and b) stream data mining: the ability to handle streams of large volumes of data. Handling lots of data in a 'reasonable' amount of time is key for Ricardo; for example, being able to carry out offline computations within a week rather than a year.
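
To illustrate the stream data mining theme (our example, not Ricardo's), the classic Misra-Gries algorithm finds approximately frequent items in a single pass over a stream, using bounded memory regardless of the stream's length:

    def misra_gries(stream, k):
        """Track up to k-1 candidate heavy hitters over a stream."""
        counters = {}
        for item in stream:
            if item in counters:
                counters[item] += 1
            elif len(counters) < k - 1:
                counters[item] = 1
            else:
                # Decrement every counter; drop those that reach zero.
                for key in list(counters):
                    counters[key] -= 1
                    if counters[key] == 0:
                        del counters[key]
        return counters

    # Items occurring more than len(stream)/k times are guaranteed to survive.
    events = ["click", "view", "click", "buy", "click", "view", "click"]
    print(misra_gries(events, k=3))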

An additional point of interest for Ricardo was personalisation and its relation to privacy: rather than personalising based on user data, we should personalise around user tasks. More details in the interview!

Ricardo Baeza-Yates is VP of Research for Europe and Latin America, leading the Yahoo! Research labs at Barcelona, Spain and Santiago, Chile, and also supervising the lab in Haifa, Israel. Until 2005 he was the director of the Center for Web Research at the Department of Computer Science of the Engineering School of the University of Chile, and ICREA Professor and founder of the Web Research Group at the Department of Information and Communication Technologies of Universitat Pompeu Fabra in Barcelona, Spain [1].

Big Data Analysis Interview with Peter Mika, Senior Scientist at Yahoo! Research Labs in Barcelona

New Big Data analysis interview with Peter Mika is out:

Within the standard interface

As an MP3 audio file

As a small video-only window

As a large video window with the ability to jump to a specific segment

The main theme for Peter was using machine learning, information extraction and Semantic Web technologies to reduce Big Data into more manageable chunks, and how, combined with new programming paradigms such as Hadoop, we can now accomplish more. Background knowledge (even in a simple form) enables Yahoo! to understand that "Brad Pitt Fight Club" is a search for a movie featuring Brad Pitt, a classic example of disambiguation.
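
As a toy illustration of that disambiguation idea (a hypothetical sketch, not Yahoo!'s actual system), a tiny table of background knowledge is enough to map the query onto an entity rather than a bag of keywords:

    # Hypothetical background knowledge: cue phrases mapped to entities.
    KNOWLEDGE = {
        ("brad pitt", "fight club"): {"type": "movie", "title": "Fight Club"},
        ("brad pitt",): {"type": "actor", "name": "Brad Pitt"},
    }

    def interpret(query):
        tokens = query.lower()
        # Prefer the most specific entity whose cue phrases all appear.
        for cues, entity in sorted(KNOWLEDGE.items(), key=lambda kv: -len(kv[0])):
            if all(cue in tokens for cue in cues):
                return entity
        return {"type": "keywords", "terms": query.split()}

    print(interpret("Brad Pitt Fight Club"))  # -> the movie entity
    print(interpret("Brad Pitt interview"))   # -> the actor entity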

Peter Mika is a researcher working on the topic of semantic search at Yahoo Research in Barcelona, Spain. Peter also serves as a Data Architect for Yahoo Search, advising the team on issues related to knowledge representation. Peter is a frequent speaker at events, a regular contributor to the Semantic Web Gang podcast series and a blogger at tripletalk.wordpress.com [1].

Big Data Analysis Interview with Usman Haque, Pachube Founder and Director of the Urban Projects Division at COSM


Check out our Big Data analysis interview with Usman Haque, Pachube founder and Director of the Urban Projects Division at COSM:

Within the standard interface

As an MP3 audio file

As a small video-only window

As a large video window

Usman mostly covered a community-oriented view of Big Data acquisition, which he says is very important if citizens and communities are to engage fully with important issues in the world. Key here is the fact that a community can overcome any deficiencies in the data (errors or heterogeneities) by creating its own specific tools.

Usman Haque has worked extensively with interactive environments over the years. He founded Pachube, a web platform for building internet-connected devices, buildings and environments and for storing, sharing and discovering real-time sensor, energy and environmental data, which was acquired by LogMeIn in 2011. Later, Usman took part in launching the COSM.com platform, where he headed up urban projects dealing with data, sensors and the Internet of Things.
