CRS4

Center for Advanced Studies, Research and Development in Sardinia

Pydoop

Pydoop: a Python interface for Apache Hadoop

Contacts

Simone Leo, Gianluigi Zanetti. E-mail: valorisation@crs4.it

Challenge

Over the years, the list of tools for big data analysis has grown steadily. However, not all of them offer a multi-language API. Apache Hadoop, for instance, is written in Java and expects users to write their applications in Java. Given the popularity of Python across many domains, most notably scientific computing, it is highly desirable to bring its rich toolset to the Hadoop environment.

Overview

Pydoop is a Python interface for Apache Hadoop that covers both HDFS access and MapReduce job submission.
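To make the MapReduce side concrete, here is a minimal word-count sketch of the programming model that such an interface exposes to Python code. It does not import Pydoop and runs in a single process without Hadoop; `Context`, `map_words`, `reduce_counts` and `run_local` are hypothetical names chosen for illustration, not Pydoop's API. For the actual classes and job submission tools, refer to the Pydoop documentation.

```python
from collections import defaultdict


class Context:
    """Hypothetical stand-in for the object a framework hands to map/reduce code."""

    def __init__(self):
        self.emitted = []

    def emit(self, key, value):
        # Collect output pairs; a real framework would forward them to the next stage.
        self.emitted.append((key, value))


def map_words(line, context):
    # Map step: emit (word, 1) for every whitespace-separated word in the line.
    for word in line.split():
        context.emit(word.lower(), 1)


def reduce_counts(key, values, context):
    # Reduce step: sum all the counts collected for one word.
    context.emit(key, sum(values))


def run_local(lines):
    """Run map, shuffle and reduce in one process (assumption: no Hadoop involved)."""
    map_ctx = Context()
    for line in lines:
        map_words(line, map_ctx)

    # Shuffle step: group the emitted values by key.
    groups = defaultdict(list)
    for key, value in map_ctx.emitted:
        groups[key].append(value)

    reduce_ctx = Context()
    for key in sorted(groups):
        reduce_counts(key, groups[key], reduce_ctx)
    return dict(reduce_ctx.emitted)


if __name__ == "__main__":
    print(run_local(["big data in Python", "Python on Hadoop"]))
```

On a real cluster, the map and reduce steps would run as distributed tasks and the input lines would come from HDFS rather than from an in-memory list.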

Innovative features

  • simple to use;
  • compatible with most existing Python libraries, including SciPy and NumPy (it’s built as a CPython extension).

Potential users

Anyone who needs to process large amounts of data in Python.

Impact sectors

Distributed computing - scientific computing - big data analysis.

Other resources

  1. https://crs4.github.io/pydoop/
  2. S. Leo, G. Zanetti. Pydoop: a Python MapReduce and HDFS API for Hadoop. In HPDC '10: Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, pages 819-825. Chicago, Illinois, June 21-25, 2010.

 


Our location

Loc. Piscina Manna, Edificio 1
09050 Pula (CA)

Via Ampere 2
09134 Cagliari (CA)

Phone: +39 070 92501

Fax: +39 070 9250216

Email: info@crs4.it

PEC: crs4@legalmail.it

CRS4 ©  1993 - 2021 | Privacy Policy | P.IVA 01983460922