Thursday, December 27, 2012
Have to read all from Ruby PTY output

#!/usr/bin/env ruby
require 'rubygems'
require 'pty'
require 'expect'
require 'io/console'

hosts_file = ARGV[0] || "hosts"

print "Password:"
password = $stdin.noecho(&:gets)
password.chomp!
puts

$expect_verbose = true

File.open(hosts_file).each do |host|
  host.chomp!
  print "Copying id to #{host} ... "
  begin
    PTY.spawn("ssh-copy-id #{host}") do |cp_out, cp_in, pid|
      begin
        pattern = /#{host}'s password:/
        cp_out.expect(pattern, 10) do |m|
          cp_in.printf("#{password}\n")
        end
        cp_out.readlines
      rescue Errno::EIO
      ensure
        Process.wait(pid)
      end
    end
  rescue PTY::ChildExited => e
    puts "Exited: #{e.status}"
  end

  status = $?
  if status == 0
    puts "Done!"
  else
    puts "Failed with exit code #{status}!"
  end
end
Tuesday, December 11, 2012
Puppet require vs. include vs. class
According to the Puppet reference, include and require:
- Both include and require are functions;
- Both include and require will "Evaluate one or more classes";
- Neither include nor require can handle parameterized classes;
- require is a superset of include, because it "adds the required class as a dependency";
- require could cause "nasty dependency cycle";
- require is "largely unnecessary"; see the Puppet language guide:
Puppet also has a require function, which can be used inside class definitions and which does implicitly declare a class, in the same way that the include function does. This function doesn’t play well with parameterized classes. The require function is largely unnecessary, as class-level dependencies can be managed in other ways.
- We can include a class multiple times, but cannot declare a class multiple times.
class inner {
  notice("I'm inner")
  file { "/tmp/abc":
    ensure => directory,
  }
}

class outer_a {
  # include inner
  class { "inner": }
  notice("I'm outer_a")
}

class outer_b {
  # include inner
  class { "inner": }
  notice("I'm outer_b")
}

include outer_a
include outer_b
Duplicate declaration: Class[Inner] is already declared in file /home/bewang/temp/puppet/require.pp at line 11; cannot redeclare at /home/bewang/temp/puppet/require.pp:18 on node pmaster.puppet-test.com
- You can safely include a class multiple times. The first two examples below pass, but you cannot declare class inner after outer_a or outer_b has already included it, as in the third example:
class inner { }
class outer_a { include inner }
class outer_b { include inner }

class { "inner": }
include outer_a
include outer_b
class { "inner": }
class { "outer_a": }
class { "outer_b": }
include outer_a
include outer_b
class { "inner": }  # Duplicate declaration error
Thursday, December 6, 2012
Hive Metastore Configuration
Recently I wrote a post, "Bad performance of Hive meta store for tables with large number of partitions". I did tests in our environment. Here is what I found:
- Don't configure a Hive client to access the remote MySQL database directly as follows. The performance is really bad, especially when you query a table with a large number of partitions.
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://mysql_server/hive_meta</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive_user</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>password</value>
</property>
- On the database server, use the same configuration as above.
- Start the Hive metastore service:
hive --service metastore

# If use CDH
yum install hive-metastore
/sbin/service hive-metastore start
- Configure the Hive client to use the metastore service:

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://mysql_server:9083</value>
</property>

If both hive.metastore.uris and the JDBC settings are present, Hive warns:

ERROR conf.HiveConf: Found both hive.metastore.uris and javax.jdo.option.ConnectionURL Recommended to have exactly one of those config key in configuration

The reason for the bad performance of direct access is this: when Hive does partition pruning, it reads a list of partitions, and the current metastore implementation uses JDO to query the metastore database:
- Get a list of partition names using db.getPartitionNames()
- Then call db.getPartitionsByNames(List<String> partNames). If the list is too large, it is loaded in multiple batches, 300 partitions per batch by default. The JDO calls look like this:
- For each MPartition object:
- Send 1 query to retrieve MPartition basic fields.
- Send 1 query to retrieve MStorageDescriptors
- Send 1 query to retrieve data from PART_PARAMS.
- Send 1 query to retrieve data from PARTITION_KEY_VALS.
- ...
- In total, about 10 queries are issued for each MPartition. Because every MPartition is converted into a Partition before being sent back, all of its fields must be populated.
- One query takes about 40 ms in my environment; you can calculate how long that adds up to for thousands of partitions.
- With a remote Hive metastore service, all of those queries happen locally on the database server, so each query is much cheaper and performance improves significantly. But there are still a lot of queries.
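For reference, this is roughly what a Java client pointed at the remote metastore service looks like. It is only a sketch: the thrift URI matches the configuration above, while the database and table names are placeholders.

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;

// Sketch of a client using the remote metastore service instead of direct JDBC/JDO.
// "default" and "my_table" are placeholder names.
public class MetastoreClientSketch {
    public static void main(String[] args) throws Exception {
        HiveConf conf = new HiveConf();
        conf.setVar(HiveConf.ConfVars.METASTOREURIS, "thrift://mysql_server:9083");
        HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
        // List partition names; (short) -1 means no limit.
        for (String name : client.listPartitionNames("default", "my_table", (short) -1)) {
            System.out.println(name);
        }
        client.close();
    }
}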
I also rewrote ObjectStore using EclipseLink JPA with @BatchFetch. Here are the test results: it is about 6 times faster than the remote metastore service for large partition counts, and the gap grows as the number of partitions grows.
Partitions | JDO Remote MySQL | Remote Service | EclipseLink Remote MySQL
10         | 6,142            | 353            | 569
100        | 57,076           | 3,914          | 940
200        | 116,216          | 5,254          | 1,211
500        | 287,416          | 21,385         | 3,711
1000       | 574,606          | 39,846         | 6,652
3000       | -                | 132,645        | 19,518
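For reference, this is roughly what the @BatchFetch mapping looks like in EclipseLink. It is only a sketch: the entity and field names below are hypothetical stand-ins, not the actual metastore model classes.

import java.util.List;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.ManyToOne;
import javax.persistence.OneToMany;
import org.eclipse.persistence.annotations.BatchFetch;
import org.eclipse.persistence.annotations.BatchFetchType;

// Hypothetical entities standing in for the metastore model.
@Entity
class PartitionParam {
    @Id long id;
    String paramKey;
    String paramValue;
    @ManyToOne PartitionEntity partition;
}

@Entity
public class PartitionEntity {
    @Id long id;

    // With BatchFetchType.IN, EclipseLink loads the params for a whole batch of
    // partitions with a single IN (...) query instead of one query per partition,
    // which is what removes the 1+N query pattern.
    @OneToMany(mappedBy = "partition")
    @BatchFetch(BatchFetchType.IN)
    List<PartitionParam> params;
}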
Friday, November 30, 2012
Bad performance of Hive meta store for tables with large number of partitions
Just found this article Batch fetching - optimizing object graph loading
We have some tables with 15K ~ 20K partitions. If I run a query scanning a lot of partitions, Hive can take more than 10 minutes just to submit the mapred job.
The problem is caused by ObjectStore.getPartitionsByNames when the Hive semantic analyzer tries to prune partitions. This method sends a lot of queries to our MySQL database to retrieve ALL information about the partitions. Because MPartition and MStorageDescriptor are converted into Partition and StorageDescriptor, every field is accessed during the conversion, even fields that have nothing to do with partition pruning, such as BucketCols. In our case, 10 queries are sent to the database for each partition, and each query may take 40 ms.
This is the well-known ORM 1+N problem, but it makes for a really bad user experience.
Actually, if we assembled Partition objects manually, it would only take about 10 queries per group of partitions (default group size is 300). In our environment, that works out to about 40 seconds for 30K partitions: 30K / 300 groups * 10 queries * 40 ms = 40,000 ms.
I tried this approach:
- Fetch MPartition with a fetch group and FETCH_SIZE_GREEDY, so one query retrieves MPartition's primary fields and caches the MStorageDescriptor.
- Collect all descriptors into a list "msds" and run another query for MStorageDescriptor with a filter like "msds.contains(this)"; all cached descriptors are then refreshed in one query instead of n queries.
This works well for 1-1 relations, but not for 1-N relations like MPartition.values. I didn't find a way to populate those fields in just one query.
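Here is a rough sketch of the two-step approach above using the plain JDO API. The fetch-group name and the descriptor class are placeholders for the real metastore model, not actual ObjectStore code.

import java.util.Collection;
import java.util.List;
import javax.jdo.FetchPlan;
import javax.jdo.PersistenceManager;
import javax.jdo.Query;

public class JdoBatchSketch {

    // Step 1: ask JDO to load a named fetch group eagerly and to fetch greedily,
    // so the 1-1 storage descriptor relation comes back along with the partitions.
    static void configureFetchPlan(PersistenceManager pm) {
        pm.getFetchPlan().addGroup("partitionWithSd");
        pm.getFetchPlan().setFetchSize(FetchPlan.FETCH_SIZE_GREEDY);
    }

    // Step 2: one query with a contains() filter refreshes every cached descriptor
    // in the list, instead of issuing one query per descriptor.
    @SuppressWarnings("unchecked")
    static List<Object> refreshInOneQuery(PersistenceManager pm,
                                          Class<?> descriptorClass,
                                          Collection<?> msds) {
        Query q = pm.newQuery(descriptorClass, "msds.contains(this)");
        q.declareParameters("java.util.Collection msds");
        return (List<Object>) q.execute(msds);
    }
}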
Because the JDO mapping doesn't work well for the conversion (MPartition to Partition), I'm wondering whether it is worth doing this instead:
- Query each table (PARTITIONS, SDS, etc.) directly in SQL.
- Assemble the Partition objects by hand.
This is a hack and the code would be really ugly, but I couldn't find JDO support for "FETCH JOIN" or batch fetching.
Tuesday, November 27, 2012
CDH4.1.0 Hive Debug Option Bug
Hive 0.9.0 supports remote debugging. Run the following command, and Hive will suspend and listen on port 8000.

hive --debug

But there is a bug in Hive CDH4.1.0 which prevents you from using this option. You will get this error message:
[bewang@myserver ~]$ hive --debug
ERROR: Cannot load this JVM TI agent twice, check your java command line for duplicate jdwp options.
Error occurred during initialization of VM
agent library failed to init: jdwp
By setting xtrace, I found that "-XX:+UseParallelGC -agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=y" is actually added twice. In commit 54abc308164314a6fae0ef0b2f2241a6d4d9f058, HADOOP_CLIENT_OPTS is appended to HADOOP_OPTS; unfortunately, this is already done in $HADOOP_HOME/bin/hadoop.
--- a/bin/hive
+++ b/bin/hive
@@ -216,6 +216,7 @@ if [ "$DEBUG" ]; then
   else
     get_debug_params "$DEBUG"
     export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS $HIVE_MAIN_CLIENT_DEBUG_OPTS"
+    export HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
   fi
 fi
Removing this line will fix the issue.
Wednesday, November 21, 2012
Using Puppet to Manage Java Application Deployment
Deployment of a Java application can be done in different ways. We can:
- Build one jar with the application and all dependent jars exploded into it, so you don't have to worry about the classpath.
- Build a tarball. It is quite easy to use the Maven Assembly and AppAssembly plugins to build a tarball/zip that includes all dependent jars plus wrapper scripts for running the application or service; AppAssembly generates the classpath in the wrapper scripts.
- Build a war and deploy it to the container.
All of the above methods have the same issue: you repeatedly deploy the same jars again and again. One of my projects uses a lot of third-party libraries: Spring, Drools, Quartz, EclipseLink, and Apache CXF. The whole zip file is about 67MB with all transitive dependencies, but my application is actually less than 1MB. Every time, I have to build and store such a large file that includes all dependent jars without any change. Even though disk space is much cheaper today, this is still not good practice.
It is easy to deploy only my application jar, replacing the old one, as long as I don't update any dependency. Otherwise, it is still inconvenient because you have to replace multiple jars.
The ideal way is to deploy the application jar and the dependent jars into a directory other than the application directory on the target server, and rebuild the classpath each time a new version or a dependency change is deployed. This has the following advantages:
- Dependent jars only need to be deployed once.
- Different applications can share the same dependencies.
- Multiple versions can exist on the target server, so it is much easier to roll back to an old version.
- You save a lot of space and network traffic.
You have probably guessed the answer: Maven. Using Maven, the Maven dependency plugin, and the Maven local repository, you can implement such a system simply and easily.
I have a working Puppet module that does the following:
- sets up Maven from a tarball;
- sets up the Maven settings.xml, the local repository, and your Maven repository on the intranet;
- resolves the dependencies of a Maven artifact, downloading the jars into the local repository;
- sets up symlinks to the dependent jars in a directory, so that you don't keep separate copies for different applications;
- generates a file containing the classpath.
I will publish it to github once I get the time.
You can find similar Puppet modules on GitHub, like puppet-nexus and puppet-maven, but they just copy the specified jars to a directory.
Monday, November 19, 2012
Start Puppet Master As Non-Root User
My company has a lengthy process when we need root permission to deploy something on a Linux server: 3-4 days for approval. Sounds impossible? Unfortunately it is true. I don't want to run my Puppet modules with root permission, because then they would have to be managed by the IT team, and it could take even longer to deploy and maintain those modules. My goal is to avoid root permission as much as I can.
Actually, it is possible to run multiple Puppet agents on one machine. One agent is managed by IT, which handles the things I don't care about, such as NTP. The other agent runs under my control as a non-root user with a few sudo privileges, like running /sbin/service. I can then run my own Puppet master and configure my agent to connect to that master to do the deployment.
It is actually pretty simple, even without any configuration file; just provide confdir, vardir, and server on the command line.
puppet master --confdir=/home/bewang/.puppet --vardir=/home/bewang/.puppet/var --modulepath=/home/bewang/modules --pluginsync --no-daemonize --debug --verbose
puppet agent --confdir=/home/bewang/.puppet --vardir=/home/bewang/.puppet/var --server=my-pp-master --test

Of course, when you run the puppet agent for the first time, it will fail because its certificate is not signed yet. After running the following commands, everything works.

puppet cert --confdir=/home/bewang/.puppet --vardir=/home/bewang/.puppet/var list
puppet cert --confdir=/home/bewang/.puppet --vardir=/home/bewang/.puppet/var sign my-pp-agent-host-name

For a service type resource, you can provide a customized start command like this:
service { "my-service":
  ensure => running,
  start  => "sudo /sbin/service my-service start",
}
Wednesday, November 7, 2012
Tips of Jenkins and Nexus
- Fingerprint Issue in Jenkins: delete the dir d:\.jenkins\fingerprints
Waiting for Jenkins to finish collecting data
ERROR: Asynchronous execution failure
java.util.concurrent.ExecutionException: hudson.util.IOException2: Unable to read D:\.jenkins\fingerprints\42\e9\40d5d2d822f4dc04c65053e630ab.xml
...
Caused by: hudson.util.IOException2: Unable to read D:\.jenkins\fingerprints\42\e9\40d5d2d822f4dc04c65053e630ab.xml
...
Caused by: com.thoughtworks.xstream.io.StreamException: : only whitespace content allowed before start tag and not \u0 (position: START_DOCUMENT seen \u0... @1:1)
...
Caused by: org.xmlpull.v1.XmlPullParserException: only whitespace content allowed before start tag and not \u0 (position: START_DOCUMENT seen \u0... @1:1)
- Maven settings.xml for a Nexus mirror:

<settings>
  <localRepository>/home/bewang/temp/maven/.m2/repository</localRepository>
  <profiles>
    <profile>
      <id>nexus</id>
      <activation>
        <activeByDefault>true</activeByDefault>
      </activation>
      <repositories>
        <repository>
          <id>my-repo</id>
          <url>http://nexus-server:8080/nexus/content/groups/public</url>
          <releases><enabled>true</enabled></releases>
          <snapshots><enabled>true</enabled></snapshots>
        </repository>
      </repositories>
    </profile>
  </profiles>
  <mirrors>
    <mirror>
      <id>nexus-public</id>
      <mirrorOf>*,!eclipselink-repo</mirrorOf>
      <url>http://nexus-server:8080/nexus/content/groups/public</url>
    </mirror>
    <mirror>
      <id>nexus-eclipselink</id>
      <mirrorOf>eclipselink-repo</mirrorOf>
      <url>http://nexus-server:8080/nexus/content/repositories/eclipselink-maven-mirror</url>
    </mirror>
  </mirrors>
</settings>
Tuesday, October 30, 2012
Hue 2.0 failed when syncdb with a MySQL database
- Don't use utf8 as the default charset. Change /etc/my.cnf to use latin1; with latin1, a varchar can hold 64K. After syncdb, just modify the tables' charset back to utf8 if you still want utf8.
- Edit the migration to replace all 32678 with 12678 before running syncdb. It doesn't matter what value you change it to, because the developers already have the fix in migration 0003_xxx in the same directory, and the table will be fixed anyway. Don't forget to delete 0002_.pyc in the directory.
Monday, October 29, 2012
Import users to Cloudera Hue 2 from Hue 1.2.0
I thought it should not be too hard to import users from Hue 1.2.0 into Hue 2 (CDH 4.1) because this page doesn't mention special steps: https://ccp.cloudera.com/display/CDH4DOC/Hue+Installation#HueInstallation-UpgradingHuefromCDH3toCDH4.
But I was wrong. After importing auth_user, auth_group, and auth_user_groups and successfully logging on with my old username and password, I got "Server Error (500)" and could not find any error message in the log files.
It turns out that you have to create records in the useradmin_grouppermission and useradmin_userprofile tables for each user in Hue 2 (CDH 4.1). Here are the queries:

mysql> insert into useradmin_grouppermission(hue_permission_id, group_id) select hp.id, g.id from useradmin_huepermission hp inner join auth_group g;
mysql> insert into useradmin_userprofile (user_id, creation_method, home_directory) select u.id, 'HUE', concat('/user/', u.username) from auth_user u;

You may need to add id <> 1 if you have already created the superuser. It is better not to create the superuser when you run syncdb for the first time. You can do it like this:
- drop database hue
- create database hue
- build/env/bin/hue syncdb
- answer no when you are asked whether to create a superuser:

  You just installed Django's auth system, which means you don't have any superusers defined.
  Would you like to create one now? (yes/no): no
- mysqldump -uxxx -pxxxx -h old_db --compact --no-create-info --disable-keys hue auth_group auth_user auth_user_groups auth_user_user_permissions > hue_user.sql
- mysql -uyyy -pyyyy -h new_db -D hue < hue_user.sql
- run the above insert queries
Friday, October 26, 2012
DB2 Client Install on Linux
- db2ls can show the current installation
- and the installed components
- db2idrop to remove, then db2_deinstall
[bewang@logs]$ db2ls

Install Path         Level     Fix Pack  Special Install Number  Install Date                  Installer UID
---------------------------------------------------------------------------------------------------------------------
/opt/ibm/db2/V9.7    9.7.0.2   2                                 Tue Oct 4 17:08:48 2011 PDT   0
[bewang@logs]$ db2ls -q -b /opt/ibm/db2/V9.7

Install Path : /opt/ibm/db2/V9.7

Feature Response File ID    Level     Fix Pack  Feature Description
---------------------------------------------------------------------------------------------------------------------
BASE_CLIENT                 9.7.0.2   2         Base client support
JAVA_SUPPORT                9.7.0.2   2         Java support
LDAP_EXPLOITATION           9.7.0.2   2         DB2 LDAP support
Friday, October 12, 2012
Setup Tomcat on CentOS for Windows Authentication using SPNEGO
The big problem I faced was my company's network settings:
- There are two networks: corp.mycompany.com and lab.mycompany.com.
- lab trusts corp, but corp doesn't trust lab
Here is a question: where should you create the pre-auth account: in lab or corp?
I first created a service account in lab's AD and registered the SPNs in lab. It didn't work. When I accessed the hello_spnego.jsp page from a Windows machine in corp, I always got the dialog asking for a username and password; this is because I had enabled downgrade to basic authentication for NTLM. If I disabled basic authentication, I got a 500 error instead.
I used Wireshark to capture the packets and found the traffic below:
- Browser sends GET /hello_spnego.jsp
- Server returns 401 Unauthorized with Negotiate
- Client sends KRB5 TGS-REQ
- Client receives KRB5 KRB Error: KRB5KDC_ERR_S_PRINCIPAL_UNKNOWN
- Browser sends GET /hello_spnego.jsp HTTP/1.1, NTLMSSP_NEGOTIATE
HTTP/1.1 401 Unauthorized
Server: Apache-Coyote/1.1\r\n
WWW-Authenticate: Negotiate\r\n
WWW-Authenticate: Basic realm="LAB.MYCOMPANY.COM"\r\n
Kerberos KRB-ERROR
    Pvno: 5
    MSG Type: KRB-ERROR (30)
    stime: 2012-10-10 23:04:48 (UTC)
    susec: 394362
    error_code: KRB5KDC_ERR_S_PRINCIPAL_UNKNOWN
    Realm: CORP.MYCOMPANY.COM
    Server Name (Service and Instance): HTTP/tomcat.lab.mycompany.com
Then I tried the keytab method, because I don't like putting a username/password in plaintext in web.xml. There are still a lot of pitfalls in this step. Here is the working version of my login.conf:

spnego-client {
  com.sun.security.auth.module.Krb5LoginModule required;
};

spnego-server {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="conf/appserver.keytab"
  principal="serviceaccount@CORP.MYCOMPANY.COM"
  storeKey=true
  isInitiator=false;
};

and krb5.conf:

[libdefaults]
  default_realm = LAB.MYCOMPANY.COM
  default_tgs_enctypes = arcfour-hmac-md5 des-cbc-crc des-cbc-md5 des3-hmac-sha1
  default_tkt_enctypes = arcfour-hmac-md5 des-cbc-crc des-cbc-md5 des3-hmac-sha1
  clockskew = 300

[realms]
  LAB.MYCOMPANY.COM = {
    kdc = kdc1.lab.mycompany.com
    kdc = kdc2.lab.mycompany.com
    default_domain = lab.mycompany.com
  }

[domain_realm]
  lab.mycompany.com = LAB.MYCOMPANY.COM
  .lab.mycompany.com = LAB.MYCOMPANY.COM

You may encounter different issues if something is wrong. Here is my experience (a small standalone JAAS check is sketched at the end of this post):
- If I don't quote the principal, i.e. principal=serviceaccount@CORP.MYCOMPANY.COM without quotes, I get the configuration error below. The message is misleading, because line 9 of login.conf is the keyTab line.
- When you use ktab, the first thing to know is that only the Windows JDK ships this tool; the Linux RPM from Oracle doesn't have it.
- You should use the service account in the corp network, not lab, to generate the keytab file, like this:
- Make sure your
- I also encountered this error: KrbException: Specified version of key is not available (44). It turns out that the keytab file I generated had kvno=1 while the expected kvno was 2. You can use Wireshark to capture the KRB5 TGS-REP packet, and it will tell you which kvno is expected.
- You may have to run the ktab command multiple times to reach the correct kvno, as described on this page: http://dmdaa.wordpress.com/2010/05/08/how-to-get-needed-kvno-for-keytab-file-created-by-java-ktab-utility/. You can use ktab -l to check the kvno:
- The JDK version doesn't seem to matter: a keytab file generated by JDK 7 worked with JDK 1.6.0_32.
- I also got the Checksum error below when I used my lab service account (serviceaccount@lab.mycompany.com) in the pre-auth fields or the keytab.
Caused by: java.io.IOException: Configuration Error: Line 9: expected [option key], found [null]
ktab -a serviceaccount@CORP.MYCOMPANY.COM -k appserver.keytab
Ticket
    Tkt-vno: 5
    Realm: LAB.MYCOMPANY.COM
    Server Name ....
    enc-part rc5-hmac
        Encryption type: ...
        Kvno: 2            *** Here it is
        enc-part: ...
ktab -l -k appserver.keytab
SEVERE: Servlet.service() for servlet [jsp] in context with path [] threw exception [GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)] with root cause
java.security.GeneralSecurityException: Checksum failed
    at sun.security.krb5.internal.crypto.dk.ArcFourCrypto.decrypt(ArcFourCrypto.java:388)
    at sun.security.krb5.internal.crypto.ArcFourHmac.decrypt(ArcFourHmac.java:74)
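As mentioned above, a small standalone JAAS check made debugging much easier than going through Tomcat every time. This is only a sketch; it assumes the spnego-server entry from the login.conf above, and the file paths in the comment are assumptions.

import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

// Standalone check that the "spnego-server" JAAS entry and its keytab work.
// Run with (paths are assumptions):
//   java -Djava.security.auth.login.config=login.conf \
//        -Djava.security.krb5.conf=krb5.conf \
//        -Dsun.security.krb5.debug=true KeytabLoginCheck
public class KeytabLoginCheck {
    public static void main(String[] args) {
        try {
            LoginContext lc = new LoginContext("spnego-server");
            lc.login();
            System.out.println("Login OK, principals: " + lc.getSubject().getPrincipals());
            lc.logout();
        } catch (LoginException e) {
            e.printStackTrace();
        }
    }
}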
Thursday, October 11, 2012
Hive Server 2 in CDH4.1
- Hive Server 2 supports LDAP and Kerberos authentication, but its LDAP support only does a simple bind. Unfortunately, our LDAP server only supports SASL. If you get an error saying "Error validating the login", you may have the same issue I had. Try ldapsearch on the command line to verify that you can access the LDAP server, and take a look at /etc/ldap.conf if you use CentOS.
ldapsearch -Z -x "uid=bewang"
- Install hive-server2 from the Cloudera CDH4 yum repository and run sudo /sbin/service hive-server2 start
- Or run hive --service hiveserver2
java.sql.SQLException: Method not supported
    at org.apache.hive.jdbc.HiveDatabaseMetaData.supportsMixedCaseIdentifiers(HiveDatabaseMetaData.java:922)
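The exception above comes from a JDBC client calling a metadata method the Hive driver does not implement. For reference, here is a minimal sketch of connecting to HiveServer2 over JDBC; the host, user, and password are placeholders, and 10000 is the default HiveServer2 port.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Minimal HiveServer2 JDBC client; host/user/password are placeholders.
public class HiveServer2Example {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://hive-server:10000/default", "bewang", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}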
Sunday, September 9, 2012
Compile and install Thrift 0.8.0 on CentOS 5.8 and 6
I had trouble compiling Thrift on CentOS 5.8 with Ruby 1.9.3 and 1.8.7 in RVM.
- For 1.9.3, I ran "bundle install" but it failed compiling mongrel-1.1.5, which is not compatible with Ruby 1.9.3; mongrel-1.2.0pre2 works fine. You need to rerun "./configure" if you change the Ruby version in RVM, and when you install, you have to pass your ANT_HOME and rvm_path if root doesn't have them.
vi thrift-0.8.0/lib/rb/thrift.gemspec
  s.add_development_dependency "mongrel", "1.2.0pre2"
cd thrift-0.8.0/lib/rb
bundle install
cd ../../..
./configure
make
sudo ANT_HOME=$ANT_HOME rvm_path=$rvm_path bash -c "source /home/bewang/.rvm/scripts/rvm && rvm use 1.9.3 && make install"
- For 1.8.7, I ran "bundle install" without any problem, but the build sometimes failed with this rspec error:
Spec::Mocks::MockExpectationError in 'Thrift::UNIXSocket should raise an error when read times out'
<IO class> expected :select with (any args) once, but received it twice
/home/bewang/temp/thrift-0.8.0/lib/rb/spec/socket_spec_shared.rb:94:
sudo yum install automake libtool flex bison pkgconfig gcc-c++ boost-devel libevent-devel zlib-devel python-devel ruby-devel
tar zxvf thrift-0.8.0.tar.gz
./configure --without-ruby
make
sudo ANT_HOME=$ANT_HOME make install
Thursday, September 6, 2012
VirtualBox CentOS 5.8 guest DHCP add search
- In company, there are three networks mycompany.corp.com, mycompany.net and mycompany.dmz.com;
- At home, I want to access the Internet without VPN;
- Use VPN at home to access company's networks.
; generated by /sbin/dhclient-script
search mycompany.corp.com
nameserver 10.184.77.23
nameserver xx.10.217.47
nameserver 192.168.0.1
This is not convenient, because I have to type the full DNS name, hostA.mycompany.net, instead of just hostA, to reach a host in mycompany.net.
I can add the other two domains in the DNS tab of network-configuration, but the change is overwritten by dhclient when the interface is restarted or the machine reboots.
Hard-coding the DNS name servers doesn't work either, because they differ between the company network and home: such a setup works in one environment and fails in the other, since the company's internal DNS servers are not accessible from outside without VPN.
My solution is to configure dhclient to append the other two domains to the DNS search list, like this:
Add /etc/dhclient-eth1.conf:

interface "eth1" {
  append domain-name " mycompany.net mycompany.dmz.com";
};

Then restart the network:
sudo /sbin/service network restart

NOTES:
- The leading blank in the domain-name value is important, because the string is simply concatenated to the value received from the DHCP server.
- If it is not working, check /var/log/messages.
- Use "domain-name" for the search list; "domain-search" does not work on CentOS 5.8.
Wednesday, August 1, 2012
Run multiple commands in db2cmd
db2cmd -c db2 -tvf cmd_file.sql

But if you don't want to generate a file, here is how:
db2cmd -c db2 connect to mydb user me using passwd && db2 select * from my_table && db2 terminate

DOS uses && to run multiple commands in one line.
Monday, June 11, 2012
Fixing CentOS Root Certificate Authority Issues
Thursday, May 31, 2012
Issues after installing VirtualBox 4.1.16
- My Guest "Host-Only Network Interface" stopped working.
The IP address of my CentOS 5.6 guest was 192.168.56.101, but CentOS said the address was already in use. In Resource Monitor I found that there WERE processes using that address and listening on ports 137/138/139. It turned out that my "VirtualBox Host-Only Network" adapter on the host was using 192.168.56.101.
Here is how you can access Resources Monitor:
- Right click task bar
- Start Task Manager
- Performance
- Resource Monitor
- Network
- Listening Ports
After installing/uninstalling VirtualBox several times, the "VirtualBox Host-Only Network" adapter went back to 192.168.56.1.
- Then another weird thing happened: I lost all my printers and couldn't access any Windows file server. After several tries, I found that the network feature "Client for Microsoft Networks" was gone from "Local Area Connection" and "Wireless Network Connection". I forget whether I deleted it or it was removed while upgrading VirtualBox. Reinstalling "Client for Microsoft Networks" restored my printers, and I could access all file servers again.
Wednesday, May 30, 2012
Setup Git-P4 using puppet
class git (
  $version = "1.7.4.1"
) {
  package { "git-$version":
    ensure => present,
  }

  file { "/usr/local/bin/git-p4":
    source  => "/usr/share/doc/git-$version/contrib/fast-import/git-p4",
    mode    => 0755,
    require => Package["git-$version"],
  }
}

References:
Git-p4: create a new project in Perforce
- Create a new p4 client spec: bewang-my-app-ws with this map "//depot/my-app/... //bewang-my-app-ws/..."
- Create a p4 workspace directory: ~/p4-ws/my-app
- Create .p4config with one line "P4CLIENT=bewang-my-app-ws" under ~/p4-ws/my-app
- Create a new file in ~/p4-ws/my-app, and check it into perforce.
Although you can see Perforce files in nested folders/directories, there are actually no such objects (folders/directories) in Perforce. You cannot just create an empty folder and check it into Perforce.
This step creates the depot folder. Otherwise you may see this error when you run "git p4 rebase":
[bewang@pmaster my-app]$ git p4 rebase
Traceback (most recent call last):
  File "/usr/local/bin/git-p4", line 1926, in ?
    main()
  File "/usr/local/bin/git-p4", line 1921, in main
    if not cmd.run(args):
  File "/usr/local/bin/git-p4", line 1716, in run
    sync.run([])
  File "/usr/local/bin/git-p4", line 1673, in run
    changes = p4ChangesForPaths(self.depotPaths, self.changeRange)
  File "/usr/local/bin/git-p4", line 442, in p4ChangesForPaths
    assert depotPaths
- Create a folder for git repository: ~/git/my-app
- Copy ~/p4-ws/my-app/.p4config to ~/git/my-app/.p4config
- Run "git p4 clone //depot/my-app ."
- You should see the file you just checked in and .git folder under ~/git/my-app
- Work on your git repository
- Run "git p4 rebase" before you want to check your changes into Perforce
- Then run "git p4 submit" to submit each git commit to perforce
- I can work completely offline, without the Perforce server
Thursday, May 17, 2012
Implementing CORS using Backbone.js and Rails 3
Started GET "/components/1" for 127.0.0.1 at 2012-05-14 07:48:08 -0700
Processing by ComponentsController#show as HTML
  Parameters: {"id"=>"1"}
Completed 406 Not Acceptable in 1424ms (ActiveRecord: 1283.8ms)

The reason I got this error is that my application is supposed to accept and respond with JSON only, so my controller only renders a JSON result.
def show
  set_access_control_headers
  @components = ComponentModel.today_component_status(params[:id]).first
  respond_to do |format|
    format.json { render :json => @components }
  end
end

This stackoverflow page was very helpful: http://stackoverflow.com/questions/9241045/backbone-client-with-a-remote-rails-server
Thursday, April 26, 2012
rssh chroot jail setup on CentOS 6 64bit
- Install rssh
- Edit /etc/rssh.conf
- Create the user deployer
- Create .ssh folder for deployer and add public keys to /app/platform/.ssh/authorized_keys
- Create chroot jail
- Create /dev/null
- Create bin and copy /bin/sh
- Copy all files in /lib64 to /app/platform/chroot/lib64
- Create home directory in chroot jail
- Edit the jailed etc/passwd to change the home directory of deployer
- Then you can successfully scp a file to the repository
sudo yum install rssh

rssh.x86_64    2.3.3-2.el6.rf    @rpmforge-el6-x86_64
# /etc/rssh.conf
logfacility = LOG_USER
allowscp
#allowsftp
#allowcvs
#allowrdist
#allowrsync
umask = 022
chrootpath = /app/platform/chroot
user=deployer:011:00001:/app/platform/chroot
sudo /bin/groupadd builder
sudo /sbin/useradd -g builder -s /usr/bin/rssh -d /app/platform deployer
sudo mkdir /app/platform/.ssh
sudo cp id_rsa.pub /app/platform/.ssh/authorized_keys
sudo /usr/share/doc/rssh-2.3.3/mkchroot.sh /app/platform/chroot/

NOTE: The trailing "/" is required; otherwise the script creates a folder named "/app/platform/chroot.".
sudo mknod -m 666 dev/null c 1 3
sudo mkdir /app/platform/chroot/bin
sudo cp /bin/sh /app/platform/chroot/bin
sudo cp /lib64/* /app/platform/chroot/lib64

NOTE: Many of these lib files are probably not needed, but I don't know which ones can safely be left out.
sudo mkdir /app/platform/chroot/home/deployer
sudo chown deployer:builder /app/platform/chroot/home/deployer
deployer:x:501:34075::/home/deployer:/usr/bin/rssh
scp a_file.xml deployer@chroot_host:/repository

NOTE: Because we use a chroot, the destination path /repository is relative to the chroot jail.
- "lost connection": you get this error whenever anything is wrong.
- "Couldn't open /dev/null: No such file or directory": I got this error after copying /bin/sh; mknod resolved it.
- "unknown user 501": this went away after copying all files from /lib64/.
- "Could not chdir to home directory: No such file or directory": chroot/home/deployer was not created.
Monday, April 9, 2012
Extract File from RPM
[bewang@pmaster tmp]$ rpm2cpio hue-common-1.2.0.0+114.20-1.x86_64.rpm | cpio -idvm ./etc/hue/log4j.properties
cpio: warning: skipped 10263 bytes of junk
cpio: warning: archive header has reverse byte-order
cpio: premature end of file

After googling this issue, it turns out that rpm2cpio outputs xz-compressed data, so I have to do this:
rpm2cpio hue-common-1.2.0.0+114.20-1.x86_64.rpm | xz -d | cpio -idvm ./etc/hue/log4j.properties
Saturday, March 3, 2012
Use JDB to debug beeswax
12/02/29 12:39:14 INFO ppd.OpProcFactory: ((local_dt = '2011-11-01') and (local_dt < '2011-11-08'))
12/02/29 12:39:14 INFO ppd.OpProcFactory: Processing for TS(293)
12/02/29 12:39:14 INFO ppd.OpProcFactory: Pushdown Predicates of TS For Alias : lz.lz_omniture_hit
12/02/29 12:39:14 INFO ppd.OpProcFactory: ((local_dt = '2011-11-01') and (local_dt < '2011-11-08'))
12/02/29 12:39:14 INFO metastore.HiveMetaStore: 27: get_partition_names : db=lz tbl=lz_omniture_hit
12/02/29 12:39:14 INFO HiveMetaStore.audit: ugi=platdev ip=unknown-ip-addr cmd=get_partition_names : db=lz tbl=lz_omniture_hit
12/02/29 12:39:14 INFO metastore.HiveMetaStore: 27: get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-01-01,AT]
12/02/29 12:39:14 INFO HiveMetaStore.audit: ugi=platdev ip=unknown-ip-addr cmd=get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-01-01,AT]
12/02/29 12:39:14 INFO metastore.HiveMetaStore: 27: get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-11-01,AR]
12/02/29 12:39:14 INFO HiveMetaStore.audit: ugi=platdev ip=unknown-ip-addr cmd=get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-11-01,AR]
12/02/29 12:39:14 INFO metastore.HiveMetaStore: 27: get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-11-01,AT]
12/02/29 12:39:14 INFO HiveMetaStore.audit: ugi=platdev ip=unknown-ip-addr cmd=get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-11-01,AT]
12/02/29 12:39:14 INFO metastore.HiveMetaStore: 27: get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-11-01,AU]
12/02/29 12:39:14 INFO HiveMetaStore.audit: ugi=platdev ip=unknown-ip-addr cmd=get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-11-01,AU]
12/02/29 12:39:14 INFO metastore.HiveMetaStore: 27: get_partition_with_auth : db=lz tbl=lz_omniture_hit[2011-11-01,BE]

Unfortunately, I cannot use Eclipse on the machine where I run the Beeswax server, because X Windows is not installed. JDB is the only choice, and it is really not a great debugging tool. Here is how I use JDB to debug the Beeswax server:
- Start the Beeswax server in debug mode: add this to apps/beeswax/beeswax_server.sh before hadoop jar is called:
export HADOOP_OPTS="-Dlog4j.configuration=log4j.properties -Xdebug -Xrunjdwp:transport=dt_socket,address=4444,server=y,suspend=n"
...
echo Executing $HADOOP_HOME/bin/hadoop jar $BEESWAX_JAR "$@"
exec $HADOOP_HOME/bin/hadoop jar $BEESWAX_JAR "$@"
- Start JDB using jdb -attach 4444
- Add source file paths
> use /etc/hive/conf:$HOME/git/hue/apps/beeswax/src/beeswax/../../../../desktop/conf:/usr/lib/hadoop/conf:/usr/lib/hadoop:$HOME/git/hue/apps/beeswax/java/src/main:$HOME/hadoop/hive-0.7.1-cdh3u1/src/ql:$HOME/hadoop/hive-0.7.1-cdh3u1/src/metastore:$HOME/hadoop/hive-0.7.1-cdh3u1/src/common
- List all threads: threads
Beeswax-8[1] threads
Group system:
  (java.lang.ref.Reference$ReferenceHandler)0xec0  Reference Handler  cond. waiting
  (java.lang.ref.Finalizer$FinalizerThread)0xebf   Finalizer          cond. waiting
  (java.lang.Thread)0xebe                          Signal Dispatcher  running
Group main:
  (java.lang.Thread)0xec1                          main               running
  (java.util.TimerThread)0xebd                     Timer thread for monitoring ugi  cond. waiting
  (java.lang.Thread)0xebc                          MetaServerThread   running
  (java.lang.Thread)0xebb                          Evicter            sleeping
  (java.lang.Thread)0xeba                          pool-2-thread-1    running
  (org.apache.hadoop.util.Daemon)0xeb9             LeaseChecker       sleeping
  (java.lang.Thread)0xeb8                          pool-3-thread-1    running
  (org.apache.hadoop.util.Daemon)0xeb7             LeaseChecker       sleeping
  (org.apache.hadoop.util.Daemon)0xeb6             LeaseChecker       sleeping
  (org.apache.hadoop.util.Daemon)0xeb5             LeaseChecker       sleeping
  (org.apache.hadoop.util.Daemon)0xed1             LeaseChecker       sleeping
  (java.lang.Thread)0xef6                          Beeswax-8          running (at breakpoint)
  (org.apache.hadoop.ipc.Client$Connection)0xef9   IPC Client (47) connection to chelhadedw002/10.184.39.97:8020 from platdev  cond. waiting
  (java.lang.Thread)0xefa                          sendParams-5       cond. waiting
- Go to a thread: thread 0xebc
- Add breakpoints:
- Stop at the specified line:
stop at com.cloudera.beeswax.BeeswaxServiceImpl:689
- Stop in a method:
stop in com.cloudera.beeswax.BeeswaxServiceImpl.query
- Stop in a method of nested class:
stop in org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth
- For example, stopping in a metastore method:
MetaServerThread[1] stop in org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth
Set breakpoint org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth
MetaServerThread[1] cont
>
Breakpoint hit: "thread=Beeswax-8", org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth(), line=1,463 bci=0
Beeswax-8[1] where
  [1] org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth (HiveMetaStore.java:1,463)
  [2] org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartitionWithAuthInfo (HiveMetaStoreClient.java:657)
  [3] org.apache.hadoop.hive.ql.metadata.Hive.getPartition (Hive.java:1,293)
  [4] org.apache.hadoop.hive.ql.metadata.Hive.getPartition (Hive.java:1,258)
  [5] org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune (PartitionPruner.java:229)
  [6] org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setTaskPlan (GenMapRedUtils.java:551)
  [7] org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setTaskPlan (GenMapRedUtils.java:514)
  [8] org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.initPlan (GenMapRedUtils.java:125)
  [9] org.apache.hadoop.hive.ql.optimizer.GenMRRedSink1.process (GenMRRedSink1.java:76)
  [10] org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch (DefaultRuleDispatcher.java:89)
  [11] org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch (DefaultGraphWalker.java:88)

My Hive version is hadoop-hive-cdh3u1. The code of PartitionPruner is not good enough, and it was replaced in hadoop-hive-cdh4b1.
Wednesday, February 8, 2012
Django south migration commands
- List all migrations by application
build/env/bin/hue migrate --list
- Create migration: create the migration file
build/env/bin/hue schemamigration beeswax --auto
- Migrate: create/alter tables in the database
build/env/bin/hue migrate beeswax
- Rollback: you just applied 0007, but want to roll back to 0006
build/env/bin/hue migrate beeswax 0006
Thursday, January 5, 2012
Debug Python Process
However, sometimes I need to debug a process that I didn't have the foresight to install the signal handler in. On Linux, you can attach gdb to the process and get a Python stack trace with some gdb macros. Put http://svn.python.org/projects/python/trunk/Misc/gdbinit in ~/.gdbinit, then attach gdb with gdb -p PID and get the Python stack trace with pystack. It's not totally reliable unfortunately, but it works most of the time.

Here are better ways to debug a Python process:
- Use strace -p PID. If the process is running as another user, use sudo -u hue strace -p PID (hue is the user account for Cloudera Hue).
- Install python-debuginfo-2.4.3-44.el5 for Python 2.4 or python26-debuginfo-2.6.5-6.el5 for Python 2.6. You can use the following yum repository to get those packages; create /etc/yum.repos.d/debuginfo.repo:
[debuginfo]
name=CentOS-$releasever - DebugInfo
# CentOS-5
baseurl=http://debuginfo.centos.org/$releasever/$basearch/
gpgcheck=0
enabled=1
# CentOS-5
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5
protect=1
priority=1