<?xml version="1.0" encoding="UTF-8"?>
<tickets type="array">
  <ticket>
    <assigned-to-id nil="true"></assigned-to-id>
    <completed-date type="datetime" nil="true"></completed-date>
    <component-id type="integer" nil="true"></component-id>
    <created-on type="datetime">2009-12-03T18:25:29+00:00</created-on>
    <description>It appears that dumbo cat /hdfs/path/part* does not actually concatenate all of the parts in an HDFS directory -- instead, it silently emits only the key-value pairs from the first part.

Since the normal Dumbo syntax without the final star chokes on the _logs directory that Hadoop creates by default, people may be using this part* syntax frequently, and they may not realize that it yields incorrect results.

Current workarounds are to run dumbo cat without the star after either manually deleting the _logs directory or configuring Hadoop not to create it. Alternatively, it may be more convenient to list the part files in a directory with the HDFS ls command and process each one explicitly, to ensure that none is skipped.</description>
    <from-support type="boolean">false</from-support>
    <id type="integer">717835</id>
    <importance type="integer">0</importance>
    <is-story type="boolean">false</is-story>
    <milestone-id type="integer" nil="true"></milestone-id>
    <notification-list nil="true"></notification-list>
    <number type="integer">66</number>
    <priority type="integer">2</priority>
    <reporter-id>ceBprigzar261JaaeP0Qfc</reporter-id>
    <space-id>cLQIPQdLOr3QNweJe5aVNr</space-id>
    <status type="integer">0</status>
    <story-importance type="integer">0</story-importance>
    <summary>dumbo cat /hdfs/path/part* silently fails to concatenate all part files</summary>
    <updated-at type="datetime">2009-12-03T18:25:29+00:00</updated-at>
    <working-hours type="float" nil="true"></working-hours>
    <working-hour nil="true"></working-hour>
  </ticket>
  <ticket>
    <assigned-to-id>bZobO4dLOr3Qu8eJe5aVNr</assigned-to-id>
    <completed-date type="datetime" nil="true"></completed-date>
    <component-id type="integer" nil="true"></component-id>
    <created-on type="datetime">2009-03-11T16:49:12+00:00</created-on>
    <description>Dumbo should provide a @-root /some/path@ option that makes all @-input another/path@ options relative to the specified root. Moreover, it could automatically add path-dependent options like @-inputformat@ when such a root is given, getting the &lt;code&gt;root-&gt;option&lt;/code&gt; mappings from @/etc/dumbo.conf@ or @~/.dumborc@.</description>
    <from-support type="boolean">false</from-support>
    <id type="integer">302576</id>
    <importance type="integer">0</importance>
    <is-story type="boolean">false</is-story>
    <milestone-id type="integer" nil="true"></milestone-id>
    <notification-list nil="true"></notification-list>
    <number type="integer">1</number>
    <priority type="integer">3</priority>
    <reporter-id>bZobO4dLOr3Qu8eJe5aVNr</reporter-id>
    <space-id>cLQIPQdLOr3QNweJe5aVNr</space-id>
    <status type="integer">0</status>
    <story-importance type="integer">0</story-importance>
    <summary>Roots and mappings</summary>
    <updated-at type="datetime">2009-03-11T16:49:12+00:00</updated-at>
    <working-hours type="float" nil="true"></working-hours>
    <working-hour nil="true"></working-hour>
    <CustomFields>
      <CustomField type="List" name="Type" id="4148"></CustomField>
    </CustomFields>
  </ticket>
  <ticket>
    <assigned-to-id nil="true"></assigned-to-id>
    <completed-date type="datetime" nil="true"></completed-date>
    <component-id type="integer">3</component-id>
    <created-on type="datetime">2009-06-19T14:11:05+00:00</created-on>
    <description>Dumbo has a streamoutput option that, if set, overrides stream.reduce.output (or stream.map.output if there is no reducer). It would be neat to be able to specify both the streamoutput option and stream.[map|reduce].output.</description>
    <from-support type="boolean">false</from-support>
    <id type="integer">410588</id>
    <importance type="integer">0</importance>
    <is-story type="boolean">false</is-story>
    <milestone-id type="integer" nil="true"></milestone-id>
    <notification-list nil="true"></notification-list>
    <number type="integer">49</number>
    <priority type="integer">3</priority>
    <reporter-id>b7cRxogf0r3Os9eJe5afGb</reporter-id>
    <space-id>cLQIPQdLOr3QNweJe5aVNr</space-id>
    <status type="integer">0</status>
    <story-importance type="integer">0</story-importance>
    <summary>Add option to force read/write of typedbytes</summary>
    <updated-at type="datetime">2009-06-19T14:11:05+00:00</updated-at>
    <working-hours type="float" nil="true"></working-hours>
    <working-hour nil="true"></working-hour>
    <CustomFields>
      <CustomField type="List" name="Type" id="4148">enhancement</CustomField>
    </CustomFields>
  </ticket>
  <ticket>
    <assigned-to-id nil="true"></assigned-to-id>
    <completed-date type="datetime" nil="true"></completed-date>
    <component-id type="integer" nil="true"></component-id>
    <created-on type="datetime">2009-06-26T14:46:10+00:00</created-on>
    <description>Since starters can parse command line options with delopt, it'd be useful to be able to specify a job on the command line and set up the appropriate runner, or to have all jobs run sequentially, optionally with different output files.

This would allow you to define, within one file, several map/reduce methods in subclasses of a common base class that provides the shared functionality (e.g. parsing, reducer).</description>
    <from-support type="boolean">false</from-support>
    <id type="integer">418699</id>
    <importance type="integer">0</importance>
    <is-story type="boolean">false</is-story>
    <milestone-id type="integer" nil="true"></milestone-id>
    <notification-list nil="true"></notification-list>
    <number type="integer">51</number>
    <priority type="integer">3</priority>
    <reporter-id>cpprOYyL4r3ONCeJe5aVNr</reporter-id>
    <space-id>cLQIPQdLOr3QNweJe5aVNr</space-id>
    <status type="integer">0</status>
    <story-importance type="integer">0</story-importance>
    <summary>Allow running multiple jobs from the same original input from one script</summary>
    <updated-at type="datetime">2009-06-26T14:46:10+00:00</updated-at>
    <working-hours type="float" nil="true"></working-hours>
    <working-hour nil="true"></working-hour>
    <CustomFields>
      <CustomField type="List" name="Type" id="4148">enhancement</CustomField>
    </CustomFields>
  </ticket>
  <ticket>
    <assigned-to-id>bZobO4dLOr3Qu8eJe5aVNr</assigned-to-id>
    <completed-date type="datetime" nil="true"></completed-date>
    <component-id type="integer" nil="true"></component-id>
    <created-on type="datetime">2009-09-15T11:02:30+00:00</created-on>
    <description>When @dumbo start ./someprog.py ...@ is executed and the file @someprog.py@ doesn't exist, Dumbo returns the error &quot;relative module names not supported&quot;, which isn't very helpful.</description>
    <from-support type="boolean">false</from-support>
    <id type="integer">517763</id>
    <importance type="integer">0</importance>
    <is-story type="boolean">false</is-story>
    <milestone-id type="integer" nil="true"></milestone-id>
    <notification-list nil="true"></notification-list>
    <number type="integer">60</number>
    <priority type="integer">3</priority>
    <reporter-id>bZobO4dLOr3Qu8eJe5aVNr</reporter-id>
    <space-id>cLQIPQdLOr3QNweJe5aVNr</space-id>
    <status type="integer">0</status>
    <story-importance type="integer">0</story-importance>
    <summary>cryptic &quot;relative module names not supported&quot; error message</summary>
    <updated-at type="datetime">2009-09-15T11:02:30+00:00</updated-at>
    <working-hours type="float" nil="true"></working-hours>
    <working-hour nil="true"></working-hour>
  </ticket>
  <ticket>
    <assigned-to-id nil="true"></assigned-to-id>
    <completed-date type="datetime" nil="true"></completed-date>
    <component-id type="integer" nil="true"></component-id>
    <created-on type="datetime">2009-10-01T11:12:49+00:00</created-on>
    <description>The following scripts demonstrate a failure to fail when executed on a Hadoop cluster (the job fails as expected when executed locally).

The test uses dumbo.sumsreducer where dumbo.sumreducer should be used. A TypeError should be thrown by dumbo.sumsreducer (in the combiner). Instead, the Hadoop reports show no error and zero output from the mapper.

Run the test: &quot;python run.py&quot;</description>
    <from-support type="boolean">false</from-support>
    <id type="integer">539934</id>
    <importance type="integer">0</importance>
    <is-story type="boolean">false</is-story>
    <milestone-id type="integer" nil="true"></milestone-id>
    <notification-list nil="true"></notification-list>
    <number type="integer">61</number>
    <priority type="integer">3</priority>
    <reporter-id>d1oDK2eIKr3OVqeJe5aVNr</reporter-id>
    <space-id>cLQIPQdLOr3QNweJe5aVNr</space-id>
    <status type="integer">0</status>
    <story-importance type="integer">0</story-importance>
    <summary>Failure to fail</summary>
    <updated-at type="datetime">2009-10-01T11:12:49+00:00</updated-at>
    <working-hours type="float" nil="true"></working-hours>
    <working-hour nil="true"></working-hour>
  </ticket>
  <ticket>
    <assigned-to-id>bZobO4dLOr3Qu8eJe5aVNr</assigned-to-id>
    <completed-date type="datetime" nil="true"></completed-date>
    <component-id type="integer" nil="true"></component-id>
    <created-on type="datetime">2010-01-13T16:46:27+00:00</created-on>
    <description>It would be useful if a MultiMapper inherited all the opt decorations of the mappers that get added to it.</description>
    <from-support type="boolean">false</from-support>
    <id type="integer">832499</id>
    <importance type="integer">0</importance>
    <is-story type="boolean">false</is-story>
    <milestone-id type="integer" nil="true"></milestone-id>
    <notification-list nil="true"></notification-list>
    <number type="integer">68</number>
    <priority type="integer">3</priority>
    <reporter-id>bZobO4dLOr3Qu8eJe5aVNr</reporter-id>
    <space-id>cLQIPQdLOr3QNweJe5aVNr</space-id>
    <status type="integer">0</status>
    <story-importance type="integer">0</story-importance>
    <summary>make MultiMapper inherit options</summary>
    <updated-at type="datetime">2010-01-13T16:46:27+00:00</updated-at>
    <working-hours type="float" nil="true"></working-hours>
    <working-hour nil="true"></working-hour>
  </ticket>
</tickets>
