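Abstract

This post uses Valgrind to check how memory is allocated and freed for a derived type in Fortran. The report turns out to depend on the compiler.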

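Background

I have recently been thinking about refactoring the GAP code, so I studied object-oriented programming in Fortran and came across the concept of design patterns. Derived types, and inheritance implemented through delegation, are things I had rarely used in Fortran before; my thinking had been mostly procedural. In fact, I only gradually shifted to an object-oriented mindset while learning Python after starting graduate school. I will write up my notes on design patterns when I get the chance.

This post records a small exercise in object-oriented Fortran programming, based on two very short source files, main.f90 and mytypes.f90.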

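  • mytypes.f90 contains a module that defines the myarrays type. Its data consist of two allocatable arrays, a 1D integer array and a 2D real array, and the module defines the corresponding constructor and destructor routines.
  • main.f90 is the main program. It only calls the constructor and the destructor, so in principle there should be no memory leak.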
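
The original mytypes.f90 is not reproduced in this post, so the following is only a minimal sketch of what such a module might look like. The names myarrays, new_my_array, destroy_my_array and rarr2d come from the text; the component name iarr1d, the argument list and the array sizes (rank integers and rank x rank reals) are assumptions inferred from the byte counts in the Valgrind reports.

```fortran
module mytypes
  implicit none

  ! Derived type holding two allocatable arrays.
  type :: myarrays
    integer, allocatable :: iarr1d(:)    ! 1D integer array (name is an assumption)
    real, allocatable    :: rarr2d(:, :) ! 2D real array
  end type myarrays

contains

  ! Constructor: allocate both arrays for the given rank and zero them.
  subroutine new_my_array(self, rank)
    type(myarrays), intent(inout) :: self
    integer, intent(in) :: rank
    call destroy_my_array(self)          ! start from a clean state
    allocate(self%iarr1d(rank))
    allocate(self%rarr2d(rank, rank))
    self%iarr1d = 0
    self%rarr2d = 0.0
  end subroutine new_my_array

  ! Destructor: release whatever is currently allocated.
  subroutine destroy_my_array(self)
    type(myarrays), intent(inout) :: self
    if (allocated(self%iarr1d)) deallocate(self%iarr1d)
    if (allocated(self%rarr2d)) deallocate(self%rarr2d)
  end subroutine destroy_my_array

end module mytypes
```

A main program in the spirit of main.f90 would declare a variable of type(myarrays), call new_my_array on it with rank = 2, and then call destroy_my_array on it.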

Next, the program is checked for memory problems with Valgrind. The build uses a Makefile, and the resulting executable is test. The test platform is Fedora 27.

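Compiler-dependent Valgrind reports

Compiling with gfortran

With test built by gfortran (GCC 7.3.1), Valgrind reports no errors, but the heap summary shows 23 allocs, far more than the 2 allocate statements in the new_my_array routine.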

$ valgrind --leak-check=full --show-leak-kinds=all ./test
==10854== Memcheck, a memory error detector
==10854== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==10854== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==10854== Command: ./test
==10854==
==10854==
==10854== HEAP SUMMARY:
==10854== in use at exit: 0 bytes in 0 blocks
==10854== total heap usage: 23 allocs, 23 frees, 13,520 bytes allocated
==10854==
==10854== All heap blocks were freed -- no leaks are possible
==10854==
==10854== For counts of detected and suppressed errors, rerun with: -v
==10854== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

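Compiling with Intel Fortran

With Intel Fortran (2018 update 1), the heap summary shows 4 allocs, still more than 2 but far fewer than the 23 seen with gfortran. In addition, Valgrind reports a 32-byte “still reachable” leak, which is related to a glibc bug in this version of Fedora. No errors are reported.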

$ valgrind --leak-check=full --show-leak-kinds=all ./test
==13583== Memcheck, a memory error detector
==13583== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==13583== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==13583== Command: ./test
==13583==
==13583==
==13583== HEAP SUMMARY:
==13583== in use at exit: 32 bytes in 1 blocks
==13583== total heap usage: 4 allocs, 3 frees, 152 bytes allocated
==13583==
==13583== 32 bytes in 1 blocks are still reachable in loss record 1 of 1
==13583== at 0x4C2F01A: calloc (vg_replace_malloc.c:752)
==13583== by 0x5971714: _dlerror_run (in /usr/lib64/libdl-2.26.so)
==13583== by 0x5971129: dlsym (in /usr/lib64/libdl-2.26.so)
==13583== by 0x41165E: real_aio_init (in /home/stevezhang/codes/code-self-teaching/f90/oop/derived_types/test)
==13583== by 0x40849B: for__once_private (in /home/stevezhang/codes/code-self-teaching/f90/oop/derived_types/test)
==13583== by 0x4066B4: for_rtl_init_ (in /home/stevezhang/codes/code-self-teaching/f90/oop/derived_types/test)
==13583== by 0x402948: main (in /home/stevezhang/codes/code-self-teaching/f90/oop/derived_types/test)
==13583==
==13583== LEAK SUMMARY:
==13583== definitely lost: 0 bytes in 0 blocks
==13583== indirectly lost: 0 bytes in 0 blocks
==13583== possibly lost: 0 bytes in 0 blocks
==13583== still reachable: 32 bytes in 1 blocks
==13583== suppressed: 0 bytes in 0 blocks
==13583==
==13583== For counts of detected and suppressed errors, rerun with: -v
==13583== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

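Now for a few small experiments. If the destructor call is deliberately left out of the main program, Valgrind reports 104 bytes as “possibly lost”, and the ERROR SUMMARY shows two errors.
This is odd: with rank = 2, the 2 integers and 4 reals (4 bytes each) should account for a loss of only 24 bytes.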

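Further experiments

  • Increasing rank from 2 to 4 raises the loss to 160 bytes. In principle it should be 80 (4 integers and 16 reals).
  • Adding a second myarrays object raises the loss to 208 bytes.
  • Modifying the destructor destroy_my_array so that it skips the deallocate of the 2D array rarr2d, while still calling the destructor in the main program, gives a loss of 56 bytes (rank = 2) and 104 bytes (rank = 4).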

This suggests that 80 bytes are somehow “attached” to each object of the derived type. More precisely, each allocatable array carries an extra 40 bytes.
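
The bookkeeping behind these numbers can be written down as a small model. The sketch below assumes 4-byte default integers and reals, and it treats the 40 bytes per allocatable array as an empirical value inferred from the ifort runs above, not as a documented property of the compiler. The module and function names are made up for illustration.

```fortran
module leak_model
  implicit none
  integer, parameter :: elem_bytes = 4       ! assumed size of a default integer or real
  integer, parameter :: overhead_bytes = 40  ! extra bytes per array, inferred from the ifort runs

contains

  ! Data held by one myarrays object: rank integers plus rank*rank reals.
  pure function payload_bytes(rank) result(nbytes)
    integer, intent(in) :: rank
    integer :: nbytes
    nbytes = elem_bytes * (rank + rank * rank)
  end function payload_bytes

  ! Expected loss when the destructor is never called for nobj objects.
  pure function lost_without_destructor(rank, nobj) result(nbytes)
    integer, intent(in) :: rank, nobj
    integer :: nbytes
    nbytes = nobj * (payload_bytes(rank) + 2 * overhead_bytes)
  end function lost_without_destructor

  ! Expected loss when the destructor skips only the 2D array rarr2d.
  pure function lost_rarr2d_only(rank) result(nbytes)
    integer, intent(in) :: rank
    integer :: nbytes
    nbytes = elem_bytes * rank * rank + overhead_bytes
  end function lost_rarr2d_only

end module leak_model
```

Under these assumptions, lost_without_destructor(2, 1) gives 104, lost_without_destructor(4, 1) gives 160 and lost_without_destructor(2, 2) gives 208, while lost_rarr2d_only gives 56 for rank 2 and 104 for rank 4, which matches the values Valgrind reported.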

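Back to gfortran

Now return to the gfortran build, again deliberately leaving out the destructor call, to see how Valgrind responds.

With rank = 2, Valgrind reports a 24-byte “still reachable” leak and no errors. This matches the amount expected from the data types, and Valgrind does not treat this kind of leak as a critical problem.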

$ valgrind --leak-check=full --show-leak-kinds=all ./test
==16808== Memcheck, a memory error detector
==16808== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==16808== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==16808== Command: ./test
==16808==
==16808==
==16808== HEAP SUMMARY:
==16808== in use at exit: 24 bytes in 2 blocks
==16808== total heap usage: 23 allocs, 21 frees, 13,520 bytes allocated
==16808==
==16808== 8 bytes in 1 blocks are still reachable in loss record 1 of 2
==16808== at 0x4C2CDCB: malloc (vg_replace_malloc.c:299)
==16808== by 0x400F25: __mytypes_MOD_new_my_array (mytypes.f90:17)
==16808== by 0x40116C: MAIN__ (main.f90:8)
==16808== by 0x4011AF: main (main.f90:3)
==16808==
==16808== 16 bytes in 1 blocks are still reachable in loss record 2 of 2
==16808== at 0x4C2CDCB: malloc (vg_replace_malloc.c:299)
==16808== by 0x4010C1: __mytypes_MOD_new_my_array (mytypes.f90:20)
==16808== by 0x40116C: MAIN__ (main.f90:8)
==16808== by 0x4011AF: main (main.f90:3)
==16808==
==16808== LEAK SUMMARY:
==16808== definitely lost: 0 bytes in 0 blocks
==16808== indirectly lost: 0 bytes in 0 blocks
==16808== possibly lost: 0 bytes in 0 blocks
==16808== still reachable: 24 bytes in 2 blocks
==16808== suppressed: 0 bytes in 0 blocks
==16808==
==16808== For counts of detected and suppressed errors, rerun with: -v
==16808== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

“Memory leaks” revisited

While looking into the still reachable leak above, I found an answer on SO about this:

There is more than one way to define “memory leak”. In particular, there are two primary definitions of “memory leak” that are in common usage among programmers.

The first commonly used definition of “memory leak” is, “Memory was allocated and was not subsequently freed before the program terminated.” However, many programmers (rightly) argue that certain types of memory leaks that fit this definition don’t actually pose any sort of problem, and therefore should not be considered true “memory leaks”.

An arguably stricter (and more useful) definition of “memory leak” is, “Memory was allocated and cannot be subsequently freed because the program no longer has any pointers to the allocated memory block.” In other words, you cannot free memory that you no longer have any pointers to. Such memory is therefore a “memory leak”. Valgrind uses this stricter definition of the term “memory leak”. This is the type of leak which can potentially cause significant heap depletion, especially for long lived processes.

The “still reachable” category within Valgrind’s leak report refers to allocations that fit only the first definition of “memory leak”. These blocks were not freed, but they could have been freed (if the programmer had wanted to) because the program still was keeping track of pointers to those memory blocks.

In general, there is no need to worry about “still reachable” blocks. They don’t pose the sort of problem that true memory leaks can cause. For instance, there is normally no potential for heap exhaustion from “still reachable” blocks. This is because these blocks are usually one-time allocations, references to which are kept throughout the duration of the process’s lifetime. While you could go through and ensure that your program frees all allocated memory, there is usually no practical benefit from doing so since the operating system will reclaim all of the process’s memory after the process terminates, anyway. Contrast this with true memory leaks which, if left unfixed, could cause a process to run out of memory if left running long enough, or will simply cause a process to consume far more memory than is necessary.
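
The two definitions in the quoted answer can be illustrated in Fortran with pointers. The sketch below is hypothetical and not part of the original test code: one block stays reachable through a saved module pointer, while the other loses its only pointer before the routine returns. How Memcheck actually classifies each block may depend on the compiler, as the rest of this post shows.

```fortran
module leak_kinds
  implicit none
  ! Saved module pointer: it lives for the whole run of the program.
  real, pointer, save :: kept(:) => null()

contains

  ! Never freed, but the program keeps a pointer to the block,
  ! so it fits only the first definition ("still reachable").
  subroutine make_still_reachable(n)
    integer, intent(in) :: n
    allocate(kept(n))
    kept = 0.0
  end subroutine make_still_reachable

  ! The only pointer to the block is dropped, so the block can no
  ! longer be deallocated: a leak in the stricter sense.
  subroutine make_lost(n)
    integer, intent(in) :: n
    real, pointer :: tmp(:)
    allocate(tmp(n))
    tmp = 0.0
    nullify(tmp)
  end subroutine make_lost

end module leak_kinds
```

Calling both routines from a main program and running it under Valgrind with --leak-check=full --show-leak-kinds=all is one way to see how a given compiler's output is categorized.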

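This is an important supplement to the classification of leak kinds at the end of my earlier post, Valgrind Notes (1): A First Look at Memcheck. The answerer carefully distinguishes the two kinds of memory leak. Let us look again at the abc program from that post: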

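program abc
  implicit none
  integer :: i
  integer, allocatable :: data(:)

  allocate(data(5))
  print*, rank(data), size(data), loc(data)
  do i = 1, 5
    ! Intentional out-of-bounds write: data(0) is touched when i = 1.
    data(i-1) = i
  end do
  print*, data(1)
  print*, rank(data), size(data), loc(data)
  ! data is never deallocated, so the 5 integers stay allocated at exit.
end program abc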

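Now comment out the out-of-bounds assignment to data. Compiled with gfortran, the program gives 20 bytes “definitely lost”. With ifort it gives 60 bytes “possibly lost”. Puzzlingly, if the same code is placed in main.f90, with the original myarrays part commented out, and compiled with gfortran again, the result is a 20-byte “still reachable” leak. ifort still reports 60 bytes “possibly lost”.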

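Summary

Two conclusions can be drawn from these very simple examples. First, do not try to outsmart your compiler. Just as with translation between human languages, different compilers may translate the same high-level code into machine code of different styles, which is probably why the Valgrind results differ. Second, and quite naturally, since compilers introduce this kind of uncertainty, programmers should write the allocation and deallocation statements themselves and so reduce the uncertainty at its source.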
