Published on 2024-11-09

Tricky Golang interview questions - Part 7: Data Race

Here is another code review interview question for you. This question is more advanced than the previous ones and is targeted toward a more senior audience. The problem requires knowledge of slices and sharing data between parallel processes.

If you're not familiar with slices and how they are constructed, please check out my previous article about the Slice Header.

What is a Data Race?

A data race occurs when two or more threads (or goroutines, in the case of Go) concurrently access shared memory, and at least one of those accesses is a write operation. If there are no proper synchronization mechanisms (such as locks or channels) in place to manage access, the result can be unpredictable behavior, including corruption of data, inconsistent states, or crashes.

In essence, a data race happens when:

  • Two or more threads (or goroutines) access the same memory location at the same time.
  • At least one of the threads (or goroutines) is writing to that memory.
  • There is no synchronization to control the access to that memory.

Because of this, the order in which the threads or goroutines access or modify the shared memory is unpredictable, leading to non-deterministic behavior that can vary between runs.

      ----------------------        --------------------- 
     | Thread A: Write      |      | Thread B: Read      |
      ----------------------        --------------------- 
     | 1. Reads x           |      | 1. Reads x          |
     | 2. Adds 1 to x       |      |                     |
     | 3. Writes new value  |      |                     |
      ----------------------        --------------------- 

                    Shared variable x
                    (Concurrent access without synchronization)

Here, Thread A is modifying x (writing to it), while Thread B is reading it at the same time. If both threads are running concurrently and there’s no synchronization, Thread B could read x before Thread A has finished updating it. As a result, the data could be incorrect or inconsistent.
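
The same read-modify-write interleaving is easy to reproduce in Go. The sketch below is illustrative only: racyCount and safeCount are names made up for this example. racyCount lets several goroutines increment a shared counter with no synchronization, so its result can differ between runs, while safeCount guards the same increment with a sync.Mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// racyCount increments a shared counter from several goroutines with no
// synchronization. counter++ is a read, an add and a write, so updates
// from different goroutines can overwrite each other.
func racyCount(goroutines, perGoroutine int) int {
	counter := 0
	var wg sync.WaitGroup
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perGoroutine; j++ {
				counter++ // data race: unsynchronized read-modify-write
			}
		}()
	}
	wg.Wait()
	return counter
}

// safeCount does the same work but holds a mutex around each increment.
func safeCount(goroutines, perGoroutine int) int {
	counter := 0
	var mu sync.Mutex
	var wg sync.WaitGroup
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perGoroutine; j++ {
				mu.Lock()
				counter++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println("racy:", racyCount(4, 1000)) // may print less than 4000
	fmt.Println("safe:", safeCount(4, 1000)) // always 4000
}
```

The mutex is only one option; any mechanism that orders the accesses (a channel, an atomic operation) removes the race in the same way.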

Question: One of your teammates submitted the following code for a code review. Please review the code carefully and identify any potential issues.
And here is the code that you have to review:

package main  

import (  
    "bufio"  
    "bytes"
    "io"
    "math/rand"
    "time"
)  

func genData() []byte {  
    r := rand.New(rand.NewSource(time.Now().Unix()))  
    buffer := make([]byte, 512)  
    if _, err := r.Read(buffer); err != nil {  
        return nil
    }  
    return buffer  
}  

func publish(input []byte, output chan
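
Pieced together from the walkthrough in the next section, the complete program plausibly looks like the sketch below. Everything from publish() onward is an assumption reconstructed from that description (only the imports and genData() appear in the listing itself), so treat it as an approximation rather than the teammate's exact code. The io import is dropped here because this sketch does not need it.

```go
package main

import (
	"bufio"
	"bytes"
	"math/rand"
	"time"
)

func genData() []byte {
	r := rand.New(rand.NewSource(time.Now().Unix()))
	buffer := make([]byte, 512)
	if _, err := r.Read(buffer); err != nil {
		return nil
	}
	return buffer
}

// publish reads input in chunks of up to 8 bytes and sends each chunk on
// output. The same buffer is reused for every chunk, which is the issue
// the review is meant to uncover.
func publish(input []byte, output chan<- []byte) {
	reader := bytes.NewReader(input)

	bufferSize := 8
	buffer := make([]byte, bufferSize)

	for {
		n, err := reader.Read(buffer)
		if err != nil { // io.EOF at the end of input, or a real read error
			return
		}
		output <- buffer[:n] // the slice still points at buffer
	}
}

// consume scans one chunk; the token is discarded as a placeholder for
// real processing.
func consume(input []byte) {
	scanner := bufio.NewScanner(bytes.NewReader(input))
	for scanner.Scan() {
		b := scanner.Bytes()
		_ = b
	}
}

func main() {
	data := genData()
	workersCount := 4
	chunkChannel := make(chan []byte, workersCount)

	for i := 0; i < workersCount; i++ {
		go func() {
			for chunk := range chunkChannel {
				consume(chunk)
			}
		}()
	}

	publish(data, chunkChannel)
	close(chunkChannel)
}
```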



What do we have here?

The publish() function is responsible for reading the input data chunk by chunk and sending each chunk to the output channel. It begins by using bytes.NewReader(input) to create a reader from the input data, which allows the data to be read sequentially. A buffer of size 8 is created to hold each chunk of data as it’s being read from the input. During each iteration, reader.Read(buffer) reads up to 8 bytes from the input, and the function then sends a slice of this buffer (buffer[:n]) containing up to 8 bytes to the output channel. The loop continues until reader.Read(buffer) either encounters an error or reaches the end of the input data.

The consume() function handles the data chunks received from the channel. It processes these chunks using a bufio.Scanner, which scans each chunk of data, potentially breaking it into lines or tokens depending on how it’s configured. The variable b := scanner.Bytes() retrieves the current token being scanned. This function stands in for some basic input processing.

The main() function creates a buffered channel chunkChannel with a capacity equal to workersCount, which is set to 4 in this case. It then launches 4 worker goroutines, each of which reads data from chunkChannel concurrently. Every time a worker receives a chunk of data, it processes the chunk by calling the consume() function. The publish() function reads the generated data, breaks it into chunks of up to 8 bytes, and sends them to the channel.

The program uses goroutines to create multiple consumers, allowing for concurrent data processing. Each consumer runs in a separate goroutine, processing chunks of data independently.

If you run this code, nothing suspicious will happen:

[Running] go run "main.go"

[Done] exited with code=0 in 0.94 seconds

But there is a problem: the code carries a data race risk. The publish() function reuses the same buffer slice for every chunk, and because slices share their underlying memory, a consumer may still be reading a chunk while the publisher overwrites that same memory with the next one. Let's confirm this with race detection. Go provides a built-in tool to detect data races: the race detector. You can enable it by running your program with the -race flag:

go run -race main.go

If we add the -race flag to the run command, we receive the following output:

[Running] go run -race "main.go"

==================
WARNING: DATA RACE
Read at 0x00c00011e018 by goroutine 6:
  runtime.slicecopy()
      /GOROOT/go1.22.0/src/runtime/slice.go:325  0x0
  bytes.(*Reader).Read()
      /GOROOT/go1.22.0/src/bytes/reader.go:44  0xcc
  bufio.(*Scanner).Scan()
      /GOROOT/go1.22.0/src/bufio/scan.go:219  0xef4
  main.consume()
      /GOPATH/example/main.go:40  0x140
  main.main.func1()
      /GOPATH/example/main.go:55  0x48

Previous write at 0x00c00011e018 by main goroutine:
  runtime.slicecopy()
      /GOROOT/go1.22.0/src/runtime/slice.go:325  0x0
  bytes.(*Reader).Read()
      /GOROOT/go1.22.0/src/bytes/reader.go:44  0x168
  main.publish()
      /GOPATH/example/main.go:27  0xe4
  main.main()
      /GOPATH/example/main.go:60  0xdc

Goroutine 6 (running) created at:
  main.main()
      /GOPATH/example/main.go:53  0x50
==================
Found 1 data race(s)
exit status 66

[Done] exited with code=0 in 0.94 seconds

The warning you’re seeing is a classic data race detected by Go’s race detector. The warning message indicates that two goroutines are accessing the same memory location (0x00c00011e018) concurrently. One goroutine is reading from this memory, while another goroutine is writing to it at the same time, without proper synchronization.

The first part of the warning tells us that Goroutine 6 (which is one of the worker goroutines in your program) is reading from the memory address 0x00c00011e018 during a call to bufio.Scanner.Scan() inside the consume() function.

Read at 0x00c00011e018 by goroutine 6:
  runtime.slicecopy()
  /GOROOT/go1.22.0/src/runtime/slice.go:325  0x0
  bytes.(*Reader).Read()
  /GOROOT/go1.22.0/src/bytes/reader.go:44  0xcc
  bufio.(*Scanner).Scan()
  /GOROOT/go1.22.0/src/bufio/scan.go:219  0xef4
  main.consume()
  /GOPATH/example/main.go:40  0x140
  main.main.func1()
  /GOPATH/example/main.go:55  0x48

The second part of the warning shows that the main goroutine previously wrote to the same memory location (0x00c00011e018) during a call to bytes.Reader.Read() inside the publish() function.

Previous write at 0x00c00011e018 by main goroutine:
  runtime.slicecopy()
  /GOROOT/go1.22.0/src/runtime/slice.go:325  0x0
  bytes.(*Reader).Read()
  /GOROOT/go1.22.0/src/bytes/reader.go:44  0x168
  main.publish()
  /GOPATH/example/main.go:27  0xe4
  main.main()
  /GOPATH/example/main.go:60  0xdc

The final part of the warning explains that Goroutine 6 was created in the main function.

Goroutine 6 (running) created at:
  main.main()
  /GOPATH/example/main.go:53  0x50

In this case, while one goroutine (Goroutine 6) is reading from the buffer in consume(), the publish() function in the main goroutine is simultaneously writing to the same buffer, leading to the data race.

 -------------------                 -------------------- 
|     Publisher     |               |      Consumer      |
 -------------------                 -------------------- 
        |                                   |
        v                                   |
1. Read data into buffer                    |
        |                                   |
        v                                   |
2. Send slice of buffer to chunkChannel     |
        |                                   |
        v                                   |
  ----------------                          |
 |  chunkChannel  |                         |
  ----------------                          |
        |                                   |
        v                                   |
3. Consume reads from slice                 |
                                            v
                                    4. Concurrent access
                                    (Data Race occurs)

Why the Data Race Occurs

The data race in this code arises from how Go slices work and how memory is shared between goroutines when a slice is reused. To fully understand this, let’s break it down into two parts: the behavior of the buffer slice and the mechanics of how the race occurs. When you pass a slice like buffer[:n] to a function or channel, what you are really passing is a copy of the slice header, which contains a pointer to the slice’s underlying array along with its length and capacity. The array itself is not copied, so any write to it is visible through every slice that refers to it.

buffer = [ a, b, c, d, e, f, g, h ]  
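
A small single-goroutine sketch makes the sharing visible. The names aliasDemo and overwrite are hypothetical helpers introduced only for this illustration; overwrite stands in for any code that writes through a slice it was handed.

```go
package main

import "fmt"

// overwrite stands in for any code that writes through a slice it was given.
func overwrite(dst []byte, s string) {
	copy(dst, s)
}

// aliasDemo returns what a sub-slice shows before and after the
// original buffer is overwritten.
func aliasDemo() (before, after string) {
	buffer := []byte("abcdefgh")
	view := buffer[:3] // new slice header, same underlying array

	before = string(view)
	overwrite(buffer, "XYZ") // writes into the shared array
	after = string(view)
	return before, after
}

func main() {
	before, after := aliasDemo()
	fmt.Println(before) // abc
	fmt.Println(after)  // XYZ
}
```

The view slice was never touched directly, yet its contents changed, because only the header was copied when it was created.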





func publish(input []byte, output chan



If you send buffer[:n] to a channel, both the publish() function and any consumer goroutines will be accessing the same memory. During each iteration, the reader.Read(buffer) function reads up to 8 bytes from the input data into this buffer slice. After reading, the publisher sends buffer[:n] to the output channel, where n is the number of bytes read in the current iteration.

The problem here is that buffer is reused across iterations. Every time reader.Read() is called, it overwrites the data stored in buffer.

  • Iteration 1: The publish() function reads the first 8 bytes into buffer and sends buffer[:n] (say, [a, b, c, d, e, f, g, h]) to the channel.
  • Iteration 2: The publish() function overwrites the buffer with the next 8 bytes, let’s say [i, j, k, l, m, n, o, p], and sends buffer[:n] again.

At this point, if one of the worker goroutines is still processing the first chunk, it is now reading corrupted data, because the buffer has been overwritten by the second chunk. Reusing a slice means sharing the same memory.
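
The two iterations can be replayed deterministically, without any worker goroutines. This is a sketch under stated assumptions: publishShared is a hypothetical, stripped-down version of the publisher described above, and the channel is given enough capacity to hold both chunks before anything is received.

```go
package main

import (
	"bytes"
	"fmt"
)

// publishShared mimics the publisher described above: one 8-byte buffer,
// reused for every chunk, with buffer[:n] sent on the channel.
func publishShared(input []byte, output chan<- []byte) {
	reader := bytes.NewReader(input)
	buffer := make([]byte, 8)
	for {
		n, err := reader.Read(buffer)
		if err != nil { // io.EOF at the end of input
			return
		}
		output <- buffer[:n]
	}
}

func main() {
	// Capacity 2 lets both chunks be queued before anything is received,
	// so the example needs no extra goroutines and is deterministic.
	out := make(chan []byte, 2)
	publishShared([]byte("abcdefghijklmnop"), out)
	close(out)

	for chunk := range out {
		fmt.Println(string(chunk))
	}
	// prints "ijklmnop" twice: the first chunk was overwritten
}
```

With real workers the outcome depends on timing, which is why the original program can appear to work.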

How to fix the Data Race?

To avoid the race condition, we must ensure that each chunk of data sent to the channel has its own independent memory. This can be achieved by creating a new slice for each chunk and copying the data from the buffer to this new slice. The key fix is to copy the contents of the buffer into a new slice before sending it to the chunkChannel:

chunk := make([]byte, n)    // Step 1: Create a new slice with its own memory
copy(chunk, buffer[:n])     // Step 2: Copy data from buffer to the new slice
output <- chunk             // Step 3: Send the new slice to the channel



Why does this fix work? By creating a new slice (chunk) for each iteration, you ensure that each chunk has its own memory. This prevents the consumers from reading from the buffer that the publisher is still modifying. The copy() function copies the contents of the buffer into the newly allocated slice (chunk). This decouples the memory used by each chunk from the buffer. Now, when the publisher reads new data into the buffer, it doesn’t affect the chunks that have already been sent to the channel.

 -------------------------             ------------------------ 
|  Publisher (New Memory) |           | Consumers (Read Copy)  |
|  [ a, b, c ] --> chunk1 |           |  Reading: chunk1       |
|  [ d, e, f ] --> chunk2 |           |  Reading: chunk2       |
 -------------------------             ------------------------ 
         ↑                                    ↑
        (1)                                  (2)
   Publisher Creates New Chunk          Consumers Read Safely

This solution works because it breaks the connection between the publisher and the consumers by eliminating shared memory. Each consumer now works on its own copy of the data, which the publisher does not modify. Here’s how the modified publish() function looks:

func publish(input []byte, output chan



Summary

Slices Are Reference Types:
As mentioned earlier, Go slices behave like references: a slice is a small header that points to an underlying array. When you pass a slice to a channel or a function, you’re passing a copy of that header, not the data itself. This is why reusing a slice leads to a data race: multiple goroutines end up reading and modifying the same memory.

Memory Allocation:
When we create a new slice with make([]byte, n), Go allocates a separate block of memory for that slice. This means the new slice (chunk) has its own backing array, independent of the buffer. By copying the data from buffer[:n] into chunk, we ensure that each chunk has its own private memory space.
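
The make-and-copy pair is not the only way to get an independent backing array. As a sketch, the three hypothetical helpers below produce the same kind of copy; bytes.Clone assumes Go 1.20 or newer.

```go
package main

import (
	"bytes"
	"fmt"
)

// copyWithMake allocates a new slice and copies into it, as in the fix.
func copyWithMake(src []byte) []byte {
	dst := make([]byte, len(src))
	copy(dst, src)
	return dst
}

// copyWithAppend appends to a nil slice, which forces a new allocation.
func copyWithAppend(src []byte) []byte {
	return append([]byte(nil), src...)
}

// copyWithClone uses the standard library helper (Go 1.20+).
func copyWithClone(src []byte) []byte {
	return bytes.Clone(src)
}

func main() {
	buffer := []byte("abcdefgh")
	a := copyWithMake(buffer[:3])
	b := copyWithAppend(buffer[:3])
	c := copyWithClone(buffer[:3])

	copy(buffer, "XYZ") // overwrite the original buffer

	fmt.Println(string(a), string(b), string(c)) // abc abc abc
	fmt.Println(string(buffer[:3]))              // XYZ
}
```

All three copies keep their original bytes after the buffer is overwritten, which is the property the fix relies on.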

Decoupling Memory:
By decoupling the memory of each chunk from the buffer, the publisher can continue to read new data into the buffer without affecting the chunks that have already been sent to the channel. Each chunk now has its own independent copy of the data, so the consumers can process the chunks without interference from the publisher.

Preventing Data Races:
The main source of the data race was the concurrent access to the shared buffer. By creating new slices and copying the data, we eliminate the shared memory, and each goroutine operates on its own data. This removes the possibility of a race condition because there’s no longer any contention over the same memory.

Conclusion

The core of the fix is simple but powerful: by ensuring that each chunk of data has its own memory, we eliminate the shared resource (the buffer) that was causing the data race. This is achieved by copying the data from the buffer into a new slice before sending it to the channel. With this approach, each consumer works on its own copy of the data, independent of the publisher’s actions, ensuring safe concurrent processing without race conditions. This method of decoupling shared memory is a fundamental strategy in concurrent programming. It prevents the unpredictable behavior caused by race conditions and ensures that your Go programs remain safe, predictable, and correct, even when multiple goroutines are accessing data concurrently.

It's that easy!

Copyright notice: this article is reprinted from https://dev.to/crusty0gphr/tricky-golang-interview-questions-part-7-data-race-753?1. In case of infringement, please contact [email protected] to have it removed.