Friday, November 30, 2012

Bad performance of the Hive metastore for tables with a large number of partitions

I just found this article: Batch fetching - optimizing object graph loading

We have some tables with 15K to 20K partitions. If I run a query scanning a lot of partitions, Hive can take more than 10 minutes just to submit the MapReduce job.

The problem is caused by ObjectStore.getPartitionsByNames, which the Hive semantic analyzer calls when it tries to prune partitions. This method sends a lot of queries to our MySQL database to retrieve ALL information about the partitions. Because MPartition and MStorageDescriptor objects are converted into Partition and StorageDescriptor, every field is accessed during the conversion; in other words, even fields that have nothing to do with partition pruning, such as BucketCols, are fetched. In our case, about 10 queries are sent to the database for each partition, and each query may take 40ms.

This is the well-known ORM 1+N problem, and it makes for a really bad user experience.

Actually, if we assemble Partition objects manually, we only need about 10 queries for each group of partitions (the default group size is 300). In our environment, 30K partitions would take only about 40 seconds: 30K / 300 groups x 10 queries x 40ms.
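To make that arithmetic concrete, here is a small Python sketch. It is illustrative only; the group size, queries per group, and per-query latency are just the numbers assumed above, not values read from the metastore code.

```python
# Illustrative cost estimate for grouped (batched) partition fetching.
# Assumptions from the text: groups of 300 partitions, ~10 queries per
# group, ~40ms per query.

def batch_fetch_cost(num_partitions, group_size=300,
                     queries_per_group=10, ms_per_query=40):
    """Return (total queries, total seconds) for grouped fetching."""
    num_groups = -(-num_partitions // group_size)  # ceiling division
    total_queries = num_groups * queries_per_group
    total_seconds = total_queries * ms_per_query / 1000.0
    return total_queries, total_seconds

queries, seconds = batch_fetch_cost(30_000)
print(queries, seconds)  # 1000 queries, 40.0 seconds
```

Compare that with the current behavior of roughly 10 queries per individual partition, and the gap explains the 10-minute job submission times.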

I tried the following approach:

  1. Fetch MPartition with a fetch group and fetch_size_greedy, so that one query retrieves MPartition's primary fields and caches the associated MStorageDescriptor.
  2. Collect all descriptors into a list "msds", then run another query for MStorageDescriptor with a filter like "msds.contains(this)"; all cached descriptors are refreshed in one query instead of N queries.

This works well for 1-1 relations, but not for 1-N relations like MPartition.values. I didn't find a way to populate those fields in just one query.

Because the JDO mapping doesn't work well for the conversion (MPartition to Partition), I'm wondering whether it is worth doing the following:

  1. Query each table (PARTITIONS, SDS, etc.) directly in SQL.
  2. Assemble the Partition objects from the results.

This is a hack and the code will be really ugly, but I couldn't find JDO support for "FETCH JOIN" or batch fetching.
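The direct-SQL idea can be sketched as follows. This is a toy model, not the real metastore schema: it keeps only a PART_NAME and a LOCATION column, uses SQLite in memory instead of MySQL, and the helper name is hypothetical. The point is just that one IN-query per table replaces N per-object queries.

```python
import sqlite3

# Simplified stand-ins for the metastore's PARTITIONS and SDS tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SDS (SD_ID INTEGER PRIMARY KEY, LOCATION TEXT);
CREATE TABLE PARTITIONS (PART_ID INTEGER PRIMARY KEY,
                         PART_NAME TEXT, SD_ID INTEGER);
INSERT INTO SDS VALUES (1, '/warehouse/t/ds=2012-11-29'),
                       (2, '/warehouse/t/ds=2012-11-30');
INSERT INTO PARTITIONS VALUES (10, 'ds=2012-11-29', 1),
                              (11, 'ds=2012-11-30', 2);
""")

def get_partitions_by_names(names):
    """Assemble partitions with one query per table, not N per object."""
    qmarks = ",".join("?" * len(names))
    parts = conn.execute(
        f"SELECT PART_ID, PART_NAME, SD_ID FROM PARTITIONS "
        f"WHERE PART_NAME IN ({qmarks})", names).fetchall()
    sd_ids = [p[2] for p in parts]
    qmarks = ",".join("?" * len(sd_ids))
    sds = dict(conn.execute(
        f"SELECT SD_ID, LOCATION FROM SDS WHERE SD_ID IN ({qmarks})",
        sd_ids).fetchall())
    # Join in memory: (partition name, storage location) pairs.
    return [(name, sds[sd_id]) for _, name, sd_id in parts]

print(get_partitions_by_names(['ds=2012-11-29', 'ds=2012-11-30']))
```

The real schema has many more tables (partition values, SerDe info, bucket columns, etc.), so the hand-written assembly code would grow accordingly, which is exactly why it feels like a hack.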

Tuesday, November 27, 2012

CDH4.1.0 Hive Debug Option Bug

Hive 0.9.0 supports remote debugging. Run the following command, and Hive will suspend and listen on port 8000:

hive --debug
But there is a bug in Hive CDH4.1.0 that prevents you from using this option. You will get this error message:
[bewang@myserver ~]$ hive --debug
ERROR: Cannot load this JVM TI agent twice, check your java command line for duplicate jdwp options.
Error occurred during initialization of VM
agent library failed to init: jdwp

By setting xtrace, I found that "-XX:+UseParallelGC -agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=y" is actually added twice. In commit 54abc308164314a6fae0ef0b2f2241a6d4d9f058, HADOOP_CLIENT_OPTS is appended to HADOOP_OPTS in bin/hive; unfortunately, the same append is already done in $HADOOP_HOME/bin/hadoop.

--- a/bin/hive
+++ b/bin/hive
@@ -216,6 +216,7 @@ if [ "$DEBUG" ]; then
     get_debug_params "$DEBUG"

Removing the line in bin/hive that appends HADOOP_CLIENT_OPTS to HADOOP_OPTS fixes the issue.
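The failure mode can be shown with a few lines of Python. This only simulates the shell variable merging described above (the option strings are copied from the error scenario; the variable names mirror the Hadoop scripts):

```python
# Illustration of the bug: the jdwp agent string ends up in the final
# java options twice, once via HADOOP_OPTS (appended by bin/hive) and
# once more when bin/hadoop also appends HADOOP_CLIENT_OPTS.
debug_opt = ("-agentlib:jdwp=transport=dt_socket,server=y,"
             "address=8000,suspend=y")
hadoop_opts = f"-XX:+UseParallelGC {debug_opt}"
hadoop_client_opts = debug_opt

# What the duplicated append effectively produces:
final_opts = f"{hadoop_opts} {hadoop_client_opts}".split()
jdwp_count = sum(1 for o in final_opts if o.startswith("-agentlib:jdwp"))
print(jdwp_count)  # 2 -> the JVM refuses to load the JDWP agent twice
```

A JVM aborts at startup when the same JVM TI agent is requested twice, which is exactly the "Cannot load this JVM TI agent twice" message above.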

Wednesday, November 21, 2012

Using Puppet to Manage Java Application Deployment

Deployment of a Java application can be done in different ways. We can:

  • Build one jar with your application and all dependent jars exploded into it. You don't have to worry about your classpath.
  • Build a tarball. It is easy, using the Maven Assembly and Appassembler plugins, to build a tarball/zip that includes all dependent jars and wrapper scripts for running the application or service. Appassembler generates the classpath in the wrapper scripts.
  • Build a war and deploy it to the container.

All the above methods have the same issue: you deploy the same jars again and again. One of my projects uses a lot of third-party libraries: Spring, Drools, Quartz, EclipseLink, and Apache CXF. The whole zip file is about 67MB with all transitive dependencies, but my application itself is less than 1MB. Every time, I have to build and store such a large file, which includes all the dependent jars without any change. Even though hard drives are much cheaper today, this is still not good practice.

If I don't update any dependency, it is easy to deploy my application jar and just replace the old one each time. Otherwise, it is still inconvenient because you have to replace multiple jars.

The ideal way is to deploy the application jar and the dependent jars into a directory separate from the application directory on the target server, and to rebuild the classpath each time a new version or a dependency change is deployed. This has the following advantages:

  • Dependent jars only need to be deployed once.
  • Different applications can share the same dependencies.
  • Multiple versions can exist on the target server, making it much easier to roll back to an old version.
  • You save a lot of space and network traffic.

You can probably guess the answer: Maven. Using Maven, the Maven Dependency plugin, and the Maven local repository, you can simply and easily implement such a system.

I have a working Puppet module that can do the following:

  • set up Maven from a tarball;
  • set up Maven's settings.xml, the local repository, and your Maven repository on the intranet;
  • resolve the dependencies of a Maven artifact, downloading the jars into the local repository;
  • set up symlinks to the dependent jars in a directory, so that you don't keep different copies for different applications;
  • generate a file containing the classpath.
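The last step above can be sketched in a few lines of Python. This is a minimal illustration, not the actual module's code: it assumes a flat directory of jars (or symlinks to jars) and writes them out joined with the platform path separator. All names are hypothetical.

```python
import os

# Sketch of "generate a file having the classpath": collect all jars
# (or symlinks to jars) in a shared directory and join them into a
# single classpath string, saved to a file for the wrapper script.
def write_classpath(jar_dir, out_file):
    jars = sorted(
        os.path.join(jar_dir, f)
        for f in os.listdir(jar_dir)
        if f.endswith(".jar"))
    classpath = os.pathsep.join(jars)
    with open(out_file, "w") as fh:
        fh.write(classpath + "\n")
    return classpath
```

A wrapper script can then start the application with something like java -cp "$(cat classpath.txt)" com.example.Main, so only the classpath file changes when dependencies change.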

I will publish it to GitHub once I get the time.

You can find similar Puppet modules on GitHub, like puppet-nexus and puppet-maven, but they just copy the specified jars to a directory.

Monday, November 19, 2012

Start Puppet Master As Non-Root User

My company has a lengthy process if we need root permission to deploy something onto a Linux server: 3-4 days for approval. Sounds impossible? Unfortunately, it is true. And I don't want to run my Puppet modules with root permission, because then they would have to be managed by the IT team, and it could take even longer to deploy and maintain them. My goal is to avoid root permission as much as I can.

Actually, it is possible to run multiple Puppet agents on one machine. One agent is managed by IT, which handles the stuff I don't care about, such as NTP. The other agent runs under my control as a non-root user with some sudo privileges, e.g. to run /sbin/service. I can then manage my own Puppet master and configure my agent to connect to it to do the deployment.

It is actually pretty simple, even without any configuration files; just provide confdir, vardir, and server on the command line:

puppet master --confdir=/home/bewang/.puppet --vardir=/home/bewang/.puppet/var --modulepath=/home/bewang/modules --pluginsync --no-daemonize --debug --verbose
puppet agent --confdir=/home/bewang/.puppet --vardir=/home/bewang/.puppet/var --server=my-pp-master --test
Of course, when you run the Puppet agent for the first time, it will fail because its certificate is not signed yet. After running the following commands on the master, everything works:
puppet cert --confdir=/home/bewang/.puppet --vardir=/home/bewang/.puppet/var list
puppet cert --confdir=/home/bewang/.puppet --vardir=/home/bewang/.puppet/var sign my-pp-agent-host-name
For a service resource, you can provide a customized start command like this:
service { "my-service":
  ensure => running,
  start  => "sudo /sbin/service my-service start",
}

Wednesday, November 7, 2012

Tips of Jenkins and Nexus

  • Fingerprint issue in Jenkins: delete the directory d:\.jenkins\fingerprints
  • Waiting for Jenkins to finish collecting data
    ERROR: Asynchronous execution failure
    java.util.concurrent.ExecutionException: hudson.util.IOException2: Unable to read D:\.jenkins\fingerprints\42\e9\40d5d2d822f4dc04c65053e630ab.xml
    Caused by: hudson.util.IOException2: Unable to read D:\.jenkins\fingerprints\42\e9\40d5d2d822f4dc04c65053e630ab.xml
    Caused by: org.xmlpull.v1.XmlPullParserException: only whitespace content allowed before start tag and not \u0 (position: START_DOCUMENT seen \u0... @1:1)
  • Maven (3.0.3) cannot resolve snapshot artifacts in Nexus: enable snapshots for the repository in ~/.m2/settings.xml. One thing is still unclear: whether this also enables snapshots for other repositories because of the mirror setting.
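The parser error above ("only whitespace content allowed before start tag and not \u0") usually means a fingerprint XML file was truncated to NUL bytes. Rather than deleting the whole fingerprints directory, a sketch like the following could locate just the corrupt files; this is an illustration, and the directory layout and file check are assumptions, not Jenkins internals:

```python
import os

# Find fingerprint XML files that do not start with '<' (e.g. files
# truncated to NUL bytes), which make Jenkins' XML parser fail.
# The fingerprints directory path is whatever your Jenkins home uses.
def find_corrupt_fingerprints(fingerprint_dir):
    bad = []
    for root, _, files in os.walk(fingerprint_dir):
        for name in files:
            if not name.endswith(".xml"):
                continue
            path = os.path.join(root, name)
            with open(path, "rb") as fh:
                head = fh.read(1)
            if head != b"<":  # a valid XML file starts with '<'
                bad.append(path)
    return bad
```

The returned paths can then be deleted while Jenkins is stopped, leaving the healthy fingerprint records in place.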