Reading tsv file concurrently with multiple goroutines #38

@Wkalmar

Hello,
I'm using your tsv package to read a .tsv file.
The code below works fine:

type row struct {
	Tconst         string `tsv:"tconst"`
	TitleType      string `tsv:"titleType"`
	PrimaryTitle   string `tsv:"primaryTitle"`
	OriginalTitle  string `tsv:"originalTitle"`
	IsAdult        byte   `tsv:"isAdult"`
	StartYear      uint16 `tsv:"startYear"`
	EndYear        string `tsv:"endYear"`
	RuntimeMinutes uint16 `tsv:"runtimeMinutes"`
	Genres         string `tsv:"genres"`
}

func ReadFilePlain() {
	file, err := os.Open("/static/data.tsv")
	if err != nil {
		panic(err)
	}
	defer file.Close()
	r := tsv.NewReader(file)
	r.HasHeaderRow = true
	r.UseHeaderNames = true
	for i := 0; i < 1000; i++ {
		var v row
		err = r.Read(&v)
		if err == nil {
			fmt.Printf("%+v\n", v)
		} else {
			fmt.Println(err)
		}
	}
}

However, when I try to speed things up a bit by using goroutines, like this:

func ReadFileGoRoutines() {
	file, err := os.Open("/static/data.tsv")
	if err != nil {
		panic(err)
	}
	defer file.Close()
	r := tsv.NewReader(file)
	r.HasHeaderRow = true
	r.UseHeaderNames = true
	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			var v row
			err = r.Read(&v)
			if err == nil {
				fmt.Printf("%+v\n", v)
			} else {
				fmt.Println(err)
			}
			wg.Done()
		}()
	}
	wg.Wait()
}

I get

column tconst does not appear in the header: map[0:4 1:7 1894:5 Carmencita:3 Documentary,Short:8 \N:6 short:1 tt0000001:0]
panic: runtime error: slice bounds out of range [60:42]

Am I doing something non-idiomatic, or is this a concurrency issue?
For your convenience, I have the complete code here

Thank you in advance
Bohdan
